In [1]:
import pandas as pd

Read in the stock data from the file `data/closing-prices.csv`

In [None]:
prices = pd.read_csv('./data/closing-prices.csv')

Look at the first few rows of the data to determine its structure

In [3]:
prices.head()

,Unnamed: 0,F,TSLA,GOOG,IBM,AAPL
0,2014-01-02,12.089,150.1,,157.6001,72.7741
1,2014-01-03,12.1438,149.56,,158.543,71.1756
2,2014-01-06,12.1986,147.0,,157.9993,71.5637
3,2014-01-07,12.042,149.36,,161.1508,71.0516
4,2014-01-08,12.1673,151.28,,159.6728,71.5019


Re-read the prices file, with the following changes:

- Parse column 0 as a date
- Use column 0 as the index of the dataframe

In [4]:
prices = pd.read_csv('./data/closing-prices.csv', index_col=0, parse_dates=[0])
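The effect of `index_col=0` and `parse_dates=[0]` can be checked on a small in-memory sample. This is only a sketch: `sample_csv` and `sample` are illustrative names, and the three rows are copied from the output above rather than read from the real file.

```python
import io

import pandas as pd

# Hypothetical three-row sample mimicking the layout of closing-prices.csv:
# an unnamed first column of dates, followed by one column per ticker.
sample_csv = io.StringIO(
    ",F,TSLA\n"
    "2014-01-02,12.089,150.1\n"
    "2014-01-03,12.1438,149.56\n"
    "2014-01-06,12.1986,147.0\n"
)

# Same arguments as the real call: column 0 becomes the index and is parsed as dates.
sample = pd.read_csv(sample_csv, index_col=0, parse_dates=[0])

# The index should now be a DatetimeIndex instead of the default RangeIndex.
is_datetime_index = isinstance(sample.index, pd.DatetimeIndex)
print(is_datetime_index)
print(list(sample.columns))
```

With a `DatetimeIndex` in place, date-based selection such as `sample.loc['2014-01-03']` becomes available.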

Look at the first few rows of the data to verify its structure

In [5]:
prices.head()

,F,TSLA,GOOG,IBM,AAPL
2014-01-02,12.089,150.1,,157.6001,72.7741
2014-01-03,12.1438,149.56,,158.543,71.1756
2014-01-06,12.1986,147.0,,157.9993,71.5637
2014-01-07,12.042,149.36,,161.1508,71.0516
2014-01-08,12.1673,151.28,,159.6728,71.5019


Use the `.info()` method to see how many non-null values each column holds and what data type it has

In [6]:
prices.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1007 entries, 2014-01-02 to 2017-12-29
Data columns (total 5 columns):
F       1007 non-null float64
TSLA    1007 non-null float64
GOOG    949 non-null float64
IBM     1007 non-null float64
AAPL    1007 non-null float64
dtypes: float64(5)
memory usage: 47.2 KB
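The output above shows 949 non-null values for GOOG against 1007 rows, so 58 GOOG prices are missing. One way to get that count directly is `isna().sum()`. The sketch below runs on a toy frame: `toy` and `missing_per_column` are illustrative names, and the single GOOG price is a made-up placeholder, not a value from the file.

```python
import pandas as pd

# Toy frame: the F values come from the output above; the one GOOG price
# (550.0) is a made-up placeholder used only to show a mixed column.
toy = pd.DataFrame(
    {
        "F": [12.089, 12.1438, 12.1986],
        "GOOG": [float("nan"), float("nan"), 550.0],
    },
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

# isna() marks missing cells as True; summing the booleans counts them per column.
missing_per_column = toy.isna().sum()
print(missing_per_column)
```

Run against the real data, `prices.isna().sum()` should report the same gap that `.info()` implies.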


Using the following SQLite connection, save the data as a new table named `stocks`

In [7]:
import sqlite3
con = sqlite3.connect('./data/stocks.db')

In [8]:
prices.to_sql('stocks', con)
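To see what `to_sql` writes, the table's schema can be inspected afterwards. The sketch below uses an in-memory database (`:memory:`) and a toy frame instead of `./data/stocks.db`; `mem_con`, `toy`, and `column_names` are illustrative names. It also passes `if_exists='replace'`, an optional argument that the cell above does not use, so the snippet can be re-run without an error about the table already existing.

```python
import sqlite3

import pandas as pd

# Toy frame with an unnamed DatetimeIndex, like `prices` (values from the output above).
toy = pd.DataFrame(
    {"IBM": [157.6001, 161.1508]},
    index=pd.to_datetime(["2014-01-02", "2014-01-07"]),
)

# In-memory database, so nothing is written to disk.
mem_con = sqlite3.connect(":memory:")

# if_exists="replace" drops and recreates the table on re-runs.
toy.to_sql("stocks", mem_con, if_exists="replace")

# PRAGMA table_info returns one row per column; field 1 is the column name.
column_names = [row[1] for row in mem_con.execute("PRAGMA table_info(stocks)")]
print(column_names)

mem_con.close()
```

Because the index has no name, pandas stores it in a column called `index`, which is why the query in the next step selects `[index]`.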

Use the following SQL statement to create a new dataframe:

```sql
SELECT [index], IBM, AAPL 
FROM stocks
WHERE IBM > 160
```

(Make sure you set the `index` column as the dataframe's index and parse it as a date.)

In [None]:
query = '''
SELECT [index], IBM, AAPL 
FROM stocks
WHERE IBM > 160'''
df = pd.read_sql(query, con, index_col='index', parse_dates=['index'])
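The brackets around `index` are needed because `INDEX` is an SQL keyword, and `parse_dates` matters because a plain `sqlite3` connection hands the stored dates back as text. The round trip below is a sketch on an in-memory database with three rows taken from the outputs above; `mem_con`, `toy`, `without_dates`, and `with_dates` are illustrative names.

```python
import sqlite3

import pandas as pd

mem_con = sqlite3.connect(":memory:")

# Three rows copied from the outputs above; two of them have IBM > 160.
toy = pd.DataFrame(
    {
        "IBM": [157.6001, 161.1508, 160.3438],
        "AAPL": [72.7741, 71.0516, 72.9215],
    },
    index=pd.to_datetime(["2014-01-02", "2014-01-07", "2014-01-16"]),
)
toy.to_sql("stocks", mem_con)

# [index] is bracket-quoted because INDEX is an SQL keyword.
query = "SELECT [index], IBM, AAPL FROM stocks WHERE IBM > 160"

# Without parse_dates the index values come back as plain strings.
without_dates = pd.read_sql(query, mem_con, index_col="index")

# With parse_dates the index is converted back to datetimes.
with_dates = pd.read_sql(query, mem_con, index_col="index", parse_dates=["index"])

print(type(without_dates.index).__name__)
print(type(with_dates.index).__name__)

mem_con.close()
```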

Examine the first few rows of the dataframe and use the `.info()` method to verify its structure

In [16]:
df.head()

index,IBM,AAPL
2014-01-07,161.1508,71.0516
2014-01-16,160.3438,72.9215
2014-01-17,161.4736,71.1348
2014-01-21,160.0635,72.24
2014-03-06,160.2662,70.2476


In [17]:
df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 177 entries, 2014-01-07 to 2017-04-17
Data columns (total 2 columns):
IBM     177 non-null float64
AAPL    177 non-null float64
dtypes: float64(2)
memory usage: 4.1 KB
