# What is data? 

If you ask three different people this question, you'll probably get five different answers! Rather than trying to come up with a definition, let's just look at some examples.

## Yesterday I ate tomatoes

Suppose I decide to keep a diary about the food I eat. This could be pretty informal:

```
Monday
------
bfast: toast and jam
lunch: tomato soup and roll
supper: baked beans, sushi, treacle tart

Tuesday
-------
bfast: porridge with soya milk
lunch: tomato soup and roll
supper: peri-peri chicken, chips, coke

```

But it's still good enough as a record of my diet. (Not enough greens?)

## I run

Here's a slightly different kind of diary, recording my running exploits in the first half of December:

```
5/12/15 4.5km
7/12/15 3.1km
12/12/15 8.6km

```

## Just numbers

What about this?

```
23.87
19.85
19.22
28.93
29.41
22.23
23.50
24.95
```

It's just a list of numbers, right? Apart from the fact that the numbers are in a narrow range, it's pretty much impossible to guess what this information is about.

Here are the same numbers, but with more information added:

```
Year   Days of rainfall
-----------------------
2004        23.87
2005        19.85
2006        19.22
2007        28.93
2008        29.41
2009        22.23
2010        23.50
2011        24.95
```

So now we see that we have a *time series*: a sequence of data points measured at different times, in this case in successive years. The two columns have been given labels which tell us what the time points are, and what kind of quantity has been measured. We could also specify not just *when* but *where* the measurements were taken, namely in Edinburgh.

The bits of information which tell us things like dates, location, the kind of quantity, etc. are sometimes called *metadata*: data about data.
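
One way to keep the numbers and their metadata together is sketched below in plain Python. The values are the ones from the rainfall table above; the variable names (`rainfall`, `metadata`) and the keys used inside `metadata` are just our own choices for this illustration, not a standard.

```python
# Rainfall figures from the table above, keyed by year.
rainfall = {
    2004: 23.87,
    2005: 19.85,
    2006: 19.22,
    2007: 28.93,
    2008: 29.41,
    2009: 22.23,
    2010: 23.50,
    2011: 24.95,
}

# Metadata: data about the data. The key names are our own choice.
metadata = {
    "quantity": "Days of rainfall",
    "time_unit": "Year",
    "location": "Edinburgh",
}

# Find the year with the largest value.
wettest_year = max(rainfall, key=rainfall.get)

# The metadata lets us say what that number actually means.
print(f"{metadata['quantity']} in {metadata['location']}, by {metadata['time_unit'].lower()}")
print(f"Highest value: {rainfall[wettest_year]} in {wettest_year}")
```

Without the `metadata` dictionary, the code could still find the largest number, but nobody reading the result would know what it measures.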

### Your turn

* Find another example of time series data. Find or make up some data points that are part of the series.

* Find another example of simple numerical data which is *not* time series data. What metadata would you need to add to make sure that someone else understands the data?

## Turning Tables

We often represent data in the form of rows and columns. That's what we mean when we talk about a data table (or tabular data). So the rainfall data above had two columns and eight rows, plus a header row.
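
To make the idea of rows and columns concrete, here is a small sketch that reads the rainfall table with Python's built-in `csv` module. Writing the table as comma-separated text, and the variable names used here, are assumptions made for this illustration.

```python
import csv
import io

# The rainfall table, written as comma-separated text.
text = """Year,Days of rainfall
2004,23.87
2005,19.85
2006,19.22
2007,28.93
2008,29.41
2009,22.23
2010,23.50
2011,24.95
"""

reader = csv.reader(io.StringIO(text))
header = next(reader)   # the header row holds the column labels
rows = list(reader)     # every remaining line is one data row

print("columns:", header)
print("number of columns:", len(header))
print("number of data rows:", len(rows))

# Pick out one column by looking up its position in the header.
position = header.index("Days of rainfall")
values = [float(row[position]) for row in rows]
print(values)
```

Each row is a list with one entry per column, so a table is really just a list of rows that all share the same header.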

### Your turn

* Write down the food diary example so that it looks like a table. 

Public bodies collect *lots* of data about all manner of things. More and more, they have been making this available as [open data](https://en.wikipedia.org/wiki/Open_data) to anyone who wants to use it. Most of the time, the data is provided as some kind of table that we can download over the internet. Here's an example of data about Scottish schools which we've already downloaded for you. We're doing a bit of extra magic to make it easy to display the data, but you can ignore this for the time being.

```python
import pandas as pd  # This is an import statement.
gov_csv = pd.read_csv("../../data/opendatascotland/tutorial_1/schools.csv")
# Display the tabular data
gov_csv
```

```
Unnamed: 0  school                                             school_label                latitude  longitude  pupils
0           http://data.opendatascotland.org/id/educationa...  Linlithgow Academy          55.97160  -3.61259   1231
1           http://data.opendatascotland.org/id/educationa...  St Kentigern's Academy      55.87101  -3.63367   1215
2           http://data.opendatascotland.org/id/educationa...  James Young High,The        55.88093  -3.51523   1135
3           http://data.opendatascotland.org/id/educationa...  St Margaret's Academy       55.88937  -3.52213   1094
4           http://data.opendatascotland.org/id/educationa...  Inveralmond Community High  55.90146  -3.51932   1090
5           http://data.opendatascotland.org/id/educationa...  West Calder High            55.86291  -3.54044   950
6           http://data.opendatascotland.org/id/educationa...  Deans Community High        55.90581  -3.54977   941
7           http://data.opendatascotland.org/id/educationa...  Broxburn Academy            55.93694  -3.48778   903
8           http://data.opendatascotland.org/id/educationa...  Bathgate Academy            55.89838  -3.61313   899
9           http://data.opendatascotland.org/id/educationa...  Whitburn Academy            55.86804  -3.67964   822
```
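
Once the data is in a table, we can start asking it questions. The sketch below does not read the downloaded file; instead it types in three of the rows shown above by hand, so that it runs on its own. The variable name `schools` and the choice of which columns to keep are our own, for illustration only.

```python
import pandas as pd

# Three of the rows displayed above, typed in by hand for illustration.
schools = pd.DataFrame({
    "school_label": ["Linlithgow Academy", "St Kentigern's Academy", "James Young High,The"],
    "latitude": [55.97160, 55.87101, 55.88093],
    "longitude": [-3.61259, -3.63367, -3.51523],
    "pupils": [1231, 1215, 1135],
})

# How big is the table?
n_rows, n_columns = schools.shape
print(n_rows, "rows and", n_columns, "columns")

# The column labels tell us what each column holds.
print(list(schools.columns))

# Pick out a single column and add up its values.
total_pupils = int(schools["pupils"].sum())
print("total pupils:", total_pupils)
```

The same calls should work on a full table loaded with `pd.read_csv`, as long as it has a column called `pupils`.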
