
Language Data Models

andychu edited this page Jan 3, 2017 · 3 revisions

Based on "The Memory Models That Underlie Programming Languages" and my comments there.

https://news.ycombinator.com/item?id=13293290

The Models Each Language Uses

  • C and C++: typed object graph, COBOL records, homogeneous arrays. This is the fullest model of the languages listed here.
  • Go: like C and C++ but with GC
  • Java: typed object graph + homogeneous arrays (people sort of viewed this as a "wart", but it looks good in retrospect)
  • JavaScript: untyped object graph. ES6 (ES2015) standardized homogeneous arrays in the form of typed arrays.
  • Python: untyped object graph only! Pandas (built on NumPy) adds homogeneous arrays. (Values have types, but the composites don't have types.)
  • Lua: same as Python, but no Pandas
  • Perl: same as Python
  • shell: POSIX shell has no model -- just strings. Bash has half an array type and an associative array type, which together are sort of like an untyped COBOL records system. (The article mentions the hierarchical file system as a model. That is shell's answer for more structured data, but it also shows that shell itself doesn't belong in the taxonomy.)
  • OCaml: typed object graph. It has arrays, but typed linked lists are more central (the same goes for SML).
  • Scientific Languages
    • R: untyped object graph + homogeneous arrays, tables of heterogeneous columns
    • Julia/Matlab: matrices (multidimensional homogeneous arrays) are the core.
    • Wolfram/Mathematica: symbols?
  • Jai language: like C, but has array-of-structures and structure-of-arrays flexibility. Logical vs. Physical data model.
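The three models named in the list above (object graph, COBOL-style records, homogeneous arrays) can be sketched in one language. This is a minimal illustration using only the Python standard library, not a claim about how any of the listed languages implements them. The record layout and all field names are made up for the example.

```python
import struct
from array import array

# 1. Untyped object graph: nested dicts and lists. Only the leaf values
#    carry types; the composites accept anything, including cycles.
graph = {"name": "ada", "tags": ["x", 1, 2.5], "next": None}
graph["next"] = graph  # a reference back to itself

# 2. Record: a fixed layout of typed fields, packed into raw bytes.
#    Hypothetical layout: little-endian int32 id, float64 score, 8-byte name.
POINT = struct.Struct("<id8s")
packed = POINT.pack(7, 2.5, b"ada")
rec_id, score, name = POINT.unpack(packed)

# 3. Homogeneous array: every element has the same machine type
#    ("d" is a C double), so the container itself is typed.
samples = array("d", [1.0, 2.0, 3.0])
```

The contrast with the object graph shows up when a value of the wrong type is added: `graph["tags"].append(object())` succeeds, while `samples.append("x")` raises `TypeError`.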

Data Formats

  • JSON
  • Protocol Buffers. NOTE: later versions added both maps and unions (`oneof`, which could be mapped to sum types)
  • ASDL

Systems that Persist Data

  • SQL: relational model persisted
    • concrete storage: could be row-oriented or column oriented
  • File system
  • Redis: hashes (dicts), lists, sets, sorted sets, and strings that can also serve as counters
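The row-oriented versus column-oriented distinction noted under SQL is the same logical versus physical split as array-of-structures versus structure-of-arrays. Here is a small sketch of one logical table stored both ways. The table contents and helper names are invented for illustration, and no real database engine is being modeled.

```python
from array import array

# Logical table with columns (id, price). The values are made up.
rows = [(1, 9.5), (2, 3.0), (3, 7.25)]

# Row-oriented storage is the list of records above.
# Column-oriented storage keeps one homogeneous array per column.
columns = {
    "id": array("q", (r[0] for r in rows)),     # signed 64-bit integers
    "price": array("d", (r[1] for r in rows)),  # C doubles
}


def total_price_rows(table):
    """Aggregate over rows: every record is visited to read one field."""
    return sum(price for _id, price in table)


def total_price_columns(cols):
    """Aggregate over a column: only the 'price' array is scanned."""
    return sum(cols["price"])


def row_at(cols, i):
    """Rebuild a single logical row from the column store."""
    return (cols["id"][i], cols["price"][i])
```

Both layouts answer the same queries. The choice affects which access pattern is cheap: whole-record lookups favor rows, and single-column scans favor columns.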