# Loading and working with data

By the end of this week's tutorials, you should have acquired:

**Knowledge on:**
* What a Python module is
* Working with tabular data
* Creating and loading dataframes
* Exploring unknown datasets
* Slicing datasets


**Skills:**
* Loading pandas
* Creating a dataframe from a dictionary
* Loading a CSV
* The commands: .columns, .describe(), .value_counts() and .groupby()
* Slicing a dataframe based on column



## Extending Python

Python has many modules, packages and libraries, created by other developers, that can help us with our work.

For example, say we want to calculate the difference between two dates (in days). We could create a function to do so...

In [1]:
def difference_dates(date1,date2):
    delta = date2 - date1
    return delta

In [2]:
date1 = '2016-02-1'
date2 = '2016-04-15'

In [3]:
difference_dates(date1,date2)

TypeError: unsupported operand type(s) for -: 'str' and 'str'

In [4]:
timedelta

NameError: name 'timedelta' is not defined

Or... we could import a module that already exists, called datetime, and use its date and timedelta classes

In [3]:
from datetime import timedelta, date

In [4]:
timedelta

datetime.timedelta

To find out how such modules work and what each function requires, you can have a look at their documentation [(link to datetime documentation)](https://docs.python.org/3/library/datetime.html). In the documentation we see that a datetime.date is created from three values (year, month, day), which are then also available as attributes of the object.

In [7]:
a = date(2016,2,1)
b = date(2016,4,15)

In [8]:
a

datetime.date(2016, 2, 1)

In [9]:
delta = b - a

In [10]:
delta

datetime.timedelta(74)

In [11]:
delta.days

74
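To connect this back to the first attempt: the original function failed because it tried to subtract two strings. A minimal sketch of a working version is shown below. It assumes the dates arrive as text in year-month-day form, and it uses ```datetime.strptime``` to convert them first (the format string and this rewrite of ```difference_dates``` are illustrative choices, not the only way to do it).

```python
from datetime import date, datetime, timedelta

def difference_dates(date1, date2):
    """Return the number of days between two 'YYYY-MM-DD' strings."""
    fmt = "%Y-%m-%d"  # assumed input format: year-month-day
    d1 = datetime.strptime(date1, fmt).date()  # text -> date object
    d2 = datetime.strptime(date2, fmt).date()
    return (d2 - d1).days  # subtracting two dates gives a timedelta

print(difference_dates('2016-02-1', '2016-04-15'))  # 74

# A timedelta can also be added to a date to move forward in time
a = date(2016, 2, 1)
print(a + timedelta(days=74))  # 2016-04-15
```

The same strings that raised a TypeError earlier now work, because the subtraction happens between date objects rather than between pieces of text.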

## Installing new packages

### Package installers
Sometimes what we want to do has already been implemented by another developer, but the package is not available on our own machine. In these cases, we *install* the package locally.

The easiest way to install a package is to use ```pip```. ```pip``` is the official installer for Python packages. You can run it directly from the notebook by prefixing the command with ```!```. For example, running ```! pip install seaborn``` in a notebook cell will install the package seaborn (which we will use later in the class).
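Before installing anything, it can be handy to check whether a package is already available. The sketch below is one possible way to do that with the standard library; the helper name ```is_installed``` is made up for this example, and it assumes the name you import is the same as the name you install (true for seaborn, but not for every package).

```python
import importlib.util
from importlib import metadata

def is_installed(package_name):
    """Hypothetical helper: True if the package can be imported here."""
    return importlib.util.find_spec(package_name) is not None

# datetime ships with Python, so it is always available
print(is_installed("datetime"))

if is_installed("seaborn"):
    # Show which version pip (or another installer) put on this machine
    print("seaborn version:", metadata.version("seaborn"))
else:
    print("seaborn is not installed; run `! pip install seaborn` in a cell")
```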

**Note:** As you move on to more advanced uses of Python, you will see that it is sometimes important to create separate environments for specific projects. Given the introductory nature of this course, we won't cover this here.

### How to find packages

How do I find these extensions (libraries, modules etc.)? I usually Google :-)

To be on the safe side:
* **Only** install packages via pip
* Do not download and run packages manually on your computer (unless you know what you are doing, and you trust the source)
* Do not use the --forge option, as these tend to be packages still in development (unless, of course, you are sure of what you are doing).
