## Table

The [Table](https://hail.is/docs/devel/hail.Table.html) is Hail's distributed version of a DataFrame.  It will be familiar if you've used R or `pandas`, but differs in three important ways:

- It is distributed.  Hail `Table`s can store far more data than can fit on a single computer.
- It carries global fields.
- It is keyed.

Like a `MatrixTable`, a `Table` has different kinds of fields. There are two:

 - global fields
 - row fields

## Importing and Reading

Hail can *import* data from many sources: TSV and CSV files, JSON files, FAM files, databases, Spark, etc.  It can also *read* (and *write*) a native Hail format.

We've prepared a dataset of movie ratings in the Hail native format.  Let's read it!

## Read

You can read a dataset with [hl.read_table](https://hail.is/docs/devel/methods/impex.html#hail.methods.read_table).  It takes a path and returns a `Table`.  The `.ht` extension in the path stands for Hail Table.

In [None]:
import hail as hl
hl.init()

In [None]:
hl.utils.get_movie_lens('data/')

In [None]:
users = hl.read_table('data/users.ht')
users

## Exploring Tables

That wasn't very informative!  How do we look at the structure of a `Table`?  The simplest way is to use [describe](https://hail.is/docs/devel/hail.Table.html#hail.Table.describe), which shows the fields and their types.

In [None]:
users.describe()

`describe` tells us the structure of the table, but what about its contents?  To show the first few rows of the table, use [show](https://hail.is/docs/devel/hail.Table.html#hail.Table.show).

In [None]:
users.show()

You can use [count](https://hail.is/docs/devel/hail.Table.html#hail.Table.count) to count the rows of the table.

In [None]:
users.count()

You can access fields of tables with the Python attribute (dot) notation, or index notation (brackets).  The latter is useful when the field names are not valid Python identifiers.

In [None]:
users.occupation.describe()

In [None]:
users['occupation'].describe()
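
To see which names fall into the "not a valid Python identifier" category, here is a small plain-Python sketch that needs no Hail at all. The helper `needs_brackets` and every example name except `occupation` are hypothetical.

```python
import keyword


def needs_brackets(field_name: str) -> bool:
    """Return True if `field_name` cannot be written after a dot in Python."""
    # A name fails if it is not a valid identifier (spaces, leading digit, ...)
    # or if it is a reserved keyword such as `class`.
    return not field_name.isidentifier() or keyword.iskeyword(field_name)


# `occupation` comes from the table above; the other names are made up.
examples = ['occupation', 'zip code', '2nd_rating', 'class']
bracket_only = [name for name in examples if needs_brackets(name)]
```

A field called `zip code` could therefore only be reached as `table['zip code']`, while `occupation` works with either notation.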

You can also `show` an `Expression`. Notice that the key is shown as well.

In [None]:
users.occupation.show()

## Exercise

The movie dataset has two other tables: `movies.ht` and `ratings.ht`.  Load these tables and have a quick look around.