## Importing data

We will be working with some files I pulled from the Office for National Statistics (ONS), along with some artificially generated customer data. Please download the files from [here](https://www.dropbox.com/sh/73n4slafnpjfcdb/AAA-aPDbgI-RjpJraoY4eBOua?dl=0) and save them in a local folder on your machine. Later you will repeat the process with your own data.

The first thing we need to do is import the pandas library and set the path to the folder that holds the data.

In [None]:
import pandas as pd
working_dir = "../data/"

Note that we have given pandas the alias `pd`, which we will use whenever we refer to the library. Replace the `working_dir` path above with the folder where you saved the files.

Importing CSV files is very simple with pandas. For example:

In [None]:
household_size = pd.read_csv(working_dir + 'HouseholdSize.csv', encoding='ISO-8859-1')

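The `encoding` argument tells pandas that the file is stored as ISO-8859-1 (Latin-1) rather than UTF-8, which is the default. Without it, non-ASCII characters such as accented letters or the pound sign can raise a `UnicodeDecodeError`. Use the `head` method to check that the file has imported correctly.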

In [None]:
household_size.head()
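To see why the `encoding` argument matters, here is a minimal sketch using a made-up, in-memory CSV rather than the ONS file. The sample text and the names `raw_bytes`, `latin1_df` and `utf8_failed` are assumptions for illustration only.

```python
import io
import pandas as pd

# Hypothetical sample: one CSV row containing "é" and "£", saved as ISO-8859-1.
raw_bytes = "area,note\nExeter,café £5\n".encode("ISO-8859-1")

# Reading with the matching encoding decodes the characters correctly.
latin1_df = pd.read_csv(io.BytesIO(raw_bytes), encoding="ISO-8859-1")

# Decoding the same bytes as UTF-8 (what pandas assumes by default) fails,
# because the single bytes used for "é" and "£" are not valid UTF-8.
try:
    raw_bytes.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

If you hit a `UnicodeDecodeError` with your own data, trying `encoding='ISO-8859-1'` is a reasonable first step, although the right encoding depends on how the file was produced.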

Note that pandas gives us a wide range of options for importing from CSV; have a look at [the documentation](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html). `nrows`, which limits the number of rows read, is particularly useful, as is `skiprows`, which skips lines at the start of the file.

In [None]:
household_size_first_20 = pd.read_csv(working_dir + 'HouseholdSize.csv', encoding='ISO-8859-1', nrows=20)
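The two options can be combined. The sketch below uses a small invented CSV held in memory (the contents of `csv_text` are an assumption, not one of the course files) to show `skiprows` dropping title lines above the header and `nrows` limiting how many data rows are read.

```python
import io
import pandas as pd

# Hypothetical file contents: two title lines above the real header row.
csv_text = (
    "Household size table\n"
    "Source: example data\n"
    "area,households\n"
    "A,10\n"
    "B,20\n"
    "C,30\n"
)

# skiprows=2 discards the two title lines, so "area,households" becomes the header;
# nrows=2 then reads only the first two data rows.
preview = pd.read_csv(io.StringIO(csv_text), skiprows=2, nrows=2)
```

Reading a small preview like this is a cheap way to check column names and types before loading a large file in full.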

Now import the remaining CSV files in the same way.
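If you have many files, one possible approach is to loop over the folder instead of writing one `read_csv` call per file. The helper below, `load_csv_folder`, is a hypothetical function written for this sketch (it is not part of pandas), and it assumes every file in the folder shares the same encoding.

```python
from pathlib import Path
import pandas as pd

def load_csv_folder(folder, encoding="ISO-8859-1"):
    """Read every .csv file in `folder` into a dict keyed by file name (without extension)."""
    frames = {}
    for path in sorted(Path(folder).glob("*.csv")):
        frames[path.stem] = pd.read_csv(path, encoding=encoding)
    return frames
```

With the files from this section you could then call `frames = load_csv_folder(working_dir)` and inspect a single table with `frames['HouseholdSize'].head()`.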

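We can also import data directly from Excel files, but you will need an Excel reader library installed. Recent versions of pandas use `openpyxl` for `.xlsx` files; older versions used `xlrd`, which now only reads legacy `.xls` files.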

In [None]:
religion_detailed = pd.read_excel(working_dir + 'ReligionDetailed.xlsx', header=1)
religion_detailed.head()

Interacting with web APIs is also very easy with Python.

In [None]:
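import json
import urllib.request

# Fetch the raw response from the API and decode it as UTF-8 text
with urllib.request.urlopen("http://jsonplaceholder.typicode.com/users") as response:
    get_data = response.read().decode('utf-8')
data = json.loads(get_data)   # parse the JSON text into a list of dicts
json_df = pd.DataFrame(data)  # one row per record
json_df.head()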
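API responses often contain nested objects, which `pd.DataFrame` leaves as dictionaries inside a single column. As a rough sketch of one way to handle this, `pd.json_normalize` expands nested keys into separate columns. The payload below is invented for illustration so that the example runs offline; it is not the actual response from the URL above.

```python
import json
import pandas as pd

# Hypothetical payload shaped like a typical API response with nested objects.
payload = '''
[
  {"id": 1, "name": "Ann", "address": {"city": "Leeds", "geo": {"lat": "53.8"}}},
  {"id": 2, "name": "Bob", "address": {"city": "York", "geo": {"lat": "53.9"}}}
]
'''
records = json.loads(payload)

# pd.DataFrame keeps each nested dict as a single object in one column...
nested_df = pd.DataFrame(records)

# ...whereas json_normalize expands nested keys into dotted column names
# such as "address.city" and "address.geo.lat".
flat_df = pd.json_normalize(records)
```

Flat columns are usually easier to filter, group and merge than columns that hold dictionaries.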