# Exploring Data with Python using Jupyter Notebooks

### Objectives
By the end of this tutorial, you'll be able to:

- install Anaconda
- launch, run, and modify a Jupyter notebook
- use Python and pandas to analyse a dataset of hotel receipts




## Installation
### Conda

The quick way to set up a Python environment for data analysis is to use the cross-platform package manager conda from Continuum Analytics. First, download and install Miniconda from http://conda.pydata.org/miniconda.html.

The Miniconda installers contain the conda package manager and Python. Once Miniconda is installed, you can use the conda command to install other packages and create environments. For example, to install the libraries required for these notebooks, run:

    $ conda install ipython jupyter pandas numpy scipy sympy matplotlib cython

This should be sufficient to get a working environment on any platform supported by conda.
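To confirm that the install worked, a quick check like the one below can help. It is a minimal sketch, not part of the original notebook: the helper name `missing_packages` is made up for illustration, and the list of import names is an assumption based on the packages in the conda command above.

```python
import importlib.util

# Import names for the packages installed above (assumed; note that the
# import names IPython and Cython are capitalised).
REQUIRED = ["IPython", "pandas", "numpy", "scipy", "sympy", "matplotlib", "Cython"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported as missing, rerunning the conda command for just those packages is usually enough.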

### Linux

On Ubuntu Linux, if you don't want to use conda, install Python and the required packages with:

    $ sudo apt-get install python3 ipython3 jupyter-notebook
    $ sudo apt-get install python3-numpy python3-scipy python3-matplotlib python3-pandas



### Windows

Windows lacks a good packaging system, so the easiest way to set up a Python environment is to use a pre-packaged distribution: either install Miniconda (as described above) or download the full Anaconda installer (the easier option). See: https://www.continuum.io/downloads
    
## Getting started

-  Go to https://github.com/gulahmed/Python-Juypter-and-Data.git
-  Click "Download ZIP" 
-  Save and unzip the file (and take note of where you saved it).

Load our starter code using Jupyter:

-  Launch Jupyter. There are two ways to do this:
    -  If you're comfortable in the terminal, run the command jupyter notebook
    -  Otherwise, use Anaconda Navigator to launch Jupyter.
-  Using the Jupyter window that opens in your browser, navigate to where you saved and unzipped the starter code.
-  Click on THIS notebook to open it. You should now see this notebook on your own computer screen!


### Resources

 *  Python 3 (https://www.python.org/)
 *  matplotlib (https://matplotlib.org/)
 *  pandas (http://pandas.pydata.org/)
 *  numpy (http://www.numpy.org/)
 


### Import the libraries we'll need
In Python, you use the import statement to load libraries into your script.

## Exploring Series




In [9]:
import pandas as pd 
import numpy as np


In [10]:
# Create a series
pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])

a    0.802242
b   -0.166491
c    1.387754
d    0.326318
e   -0.757072
dtype: float64

In [11]:
# Preserve the series in variable s for later access
s = pd.Series(np.random.randn(5))
print(s)

0   -1.443499
1    0.177829
2    1.328906
3    1.740811
4    0.735915
dtype: float64


In [12]:
# Select a single element, then slice the first three elements
print(s[0])
print("\n")
print(s[:3])

-1.44349921352


0   -1.443499
1    0.177829
2    1.328906
dtype: float64
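The cell above relies on the default integer index, where label and position happen to coincide. When a Series has custom labels, pandas offers two explicit accessors: `.loc` for labels and `.iloc` for positions. The sketch below uses small made-up values (not the random data above) so the results are predictable.

```python
import pandas as pd

# Illustrative values with a custom label index
s = pd.Series([10.0, 20.0, 30.0], index=['a', 'b', 'c'])

by_label = s.loc['b']         # select by index label
by_position = s.iloc[0]       # select by integer position
first_two = s.iloc[:2]        # positional slice, end is exclusive
label_slice = s.loc['a':'b']  # label slice, both ends are inclusive

print(by_label)
print(by_position)
print(first_two)
print(label_slice)
```

Note the difference between the two slices: positional slices exclude the end point, while label slices include it.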



## Exploring DataFrames


In [13]:
# Create a dataframe
df = pd.DataFrame(s, columns = ['Column 1'])
df

   Column 1
0 -1.443499
1  0.177829
2  1.328906
3  1.740811
4  0.735915


In [None]:
# Can access columns by name
df['Column 1']

In [None]:
# Easy to add columns
df['Column 2'] = df['Column 1'] * 4
df

In [None]:


# Other manipulation, like sorting. sort_values returns a new, sorted DataFrame
# and leaves df unchanged, so assign the result to a variable if you want to keep it.
df.sort_values(by='Column 2')
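To see that sorting does not change the original DataFrame, here is a small self-contained sketch. The values are made up for illustration, and `sorted_df` is a hypothetical variable name.

```python
import pandas as pd

# Illustrative data, not the random values used above
df = pd.DataFrame({'Column 1': [3, 1, 2]})
df['Column 2'] = df['Column 1'] * 4

# sort_values returns a new DataFrame; df keeps its original row order
sorted_df = df.sort_values(by='Column 2')

print(df)
print(sorted_df)
```

The sorted copy keeps the original index labels, which makes it easy to trace each row back to where it came from.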



In [14]:
# Boolean indexing
df[df['Column 2'] <= 2]

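KeyError: 'Column 2'

The KeyError appears because the cell that adds 'Column 2' had not been run when this cell was executed. Run that cell first, and the filter returns the rows where 'Column 2' is at most 2.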
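Boolean indexing works in two steps: a comparison on a column produces a Series of True/False values (a mask), and indexing the DataFrame with that mask keeps only the rows marked True. The sketch below uses made-up values so the result is predictable; `mask`, `small`, and `in_range` are illustrative names.

```python
import pandas as pd

# Illustrative data; 'Column 2' is created first so the filter has something to test
df = pd.DataFrame({'Column 1': [-1.0, 0.25, 0.5, 2.0]})
df['Column 2'] = df['Column 1'] * 4   # -4.0, 1.0, 2.0, 8.0

mask = df['Column 2'] <= 2            # Series of True/False, one per row
small = df[mask]                      # keep only the rows where mask is True

# Combine conditions with & (and) or | (or); each condition needs parentheses
in_range = df[(df['Column 2'] > 0) & (df['Column 2'] <= 2)]

print(mask)
print(small)
print(in_range)
```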

### Example usage
Let's grab stock data from Yahoo Finance using the pandas-datareader package.

In [2]:
# matplotlib is used for plotting
import matplotlib.pyplot as plt

# pandas is the "Python Data Analysis Library". pandas-datareader builds on it to load
# remote data (here, Yahoo Finance quotes) into DataFrames that we can manipulate and print.
import pandas_datareader.data as web
import datetime as dt


# This next line is a Jupyter directive. It tells Jupyter that we want our plots to show
# up right below the code that creates them.
%matplotlib inline


In [7]:
# define start and end date
start = dt.datetime(2016, 1, 1)
end = dt.datetime(2017, 4, 18)
df = web.DataReader("AAPL", 'yahoo', start, end)  # ticker symbol, data source, start and end date
df.head()  # show the first five rows

                  Open        High         Low       Close    Volume   Adj Close
Date
2016-01-04  102.610001  105.370003  102.000000  105.349998  67649400  102.612183
2016-01-05  105.750000  105.849998  102.410004  102.709999  55791000  100.040792
2016-01-06  100.559998  102.370003   99.870003  100.699997  68457400   98.083025
2016-01-07   98.680000  100.129997   96.430000   96.449997  81094400   93.943473
2016-01-08   98.550003   99.110001   96.760002   96.959999  70798000   94.440222
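The indexing techniques from the earlier sections apply to this date-indexed DataFrame as well. The sketch below is a minimal illustration that rebuilds only the Open and Close columns of the five rows shown above by hand, so it runs without a network connection. The variable names are illustrative.

```python
import pandas as pd

# Open and Close values copied from the five rows in the output above
prices = pd.DataFrame(
    {
        "Open": [102.610001, 105.75, 100.559998, 98.68, 98.550003],
        "Close": [105.349998, 102.709999, 100.699997, 96.449997, 96.959999],
    },
    index=pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-06", "2016-01-07", "2016-01-08"]
    ),
)
prices.index.name = "Date"

# Slice rows by date label (both ends are inclusive)
first_three_days = prices.loc["2016-01-04":"2016-01-06"]

# Boolean indexing: days on which the stock closed above its opening price
up_days = prices[prices["Close"] > prices["Open"]]

# Date of the highest closing price
highest_close_day = prices["Close"].idxmax()

print(first_three_days)
print(up_days)
print(highest_close_day)
```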


In [None]:
from pandas_datareader import data, wb
import datetime as dt
