<a href="https://colab.research.google.com/github/michalastocki/data-science-bootcamp/blob/master/02_analiza_danych/03_zapis_odczyt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

* @author: krakowiakpawel9@gmail.com  
* @site: e-smartdata.org

### Pandas
>Library website: [https://pandas.pydata.org/](https://pandas.pydata.org/)  
>Documentation: [https://pandas.pydata.org/pandas-docs/stable/](https://pandas.pydata.org/pandas-docs/stable/)
>
>The fundamental library for data analysis in Python.
>
>To install Pandas, run the command below:
```
pip install pandas
```
### Table of contents:
1. [Importing libraries](#a1)
2. [Loading data](#a2)
3. [Writing/reading data to and from a CSV file](#a3)
4. [London Bike Dataset](#a4)

### <a name='a1'></a> Importing libraries

In [None]:
import numpy as np
import pandas as pd

### <a name='a2'></a> Loading data

In [None]:
def fetch_financial_data(company='AMZN'):
    """
    Fetch stock market quotations for the given ticker from Stooq.
    """
    import pandas_datareader.data as web
    return web.DataReader(name=company, data_source='stooq')

df = fetch_financial_data('FB')
df.head()

Date,Open,High,Low,Close,Volume
2019-12-10,201.66,202.05,200.15,200.87,9485568
2019-12-09,200.65,203.142,200.21,201.34,12013218
2019-12-06,200.5,201.57,200.06,201.05,12279525
2019-12-05,199.86,201.29,198.213,199.36,9755350
2019-12-04,200.0,200.029,198.05,198.71,8459939


### <a name='a3'></a> Writing/reading data to and from a CSV file

In [None]:
df.to_csv('fb.csv')

In [None]:
df_nov = df[(df.index.month == 11) & (df.index.year == 2019)]
df_nov

Date,Open,High,Low,Close,Volume
2019-11-29,201.6,203.8,201.21,201.64,7985231
2019-11-27,199.9,203.14,199.42,202.0,12729462
2019-11-26,200.0,200.15,198.039,198.97,11748664
2019-11-25,199.515,200.97,199.25,199.79,15286442
2019-11-22,198.38,199.3,197.62,198.82,9959817
2019-11-21,197.42,199.09,196.86,197.93,12130985
2019-11-20,198.58,199.59,195.43,197.51,12370240
2019-11-19,197.4,200.0,196.86,199.32,19070291
2019-11-18,194.56,198.63,193.05,197.4,16176107
2019-11-15,194.26,195.3,193.38,195.1,11530232
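
The boolean mask above can also be written with partial string indexing on a `DatetimeIndex`. The following is a minimal sketch on a small hypothetical frame (`sample`, with made-up `Close` values); it only illustrates that both selections return the same rows and does not use the downloaded quotes.

```python
import pandas as pd

# Hypothetical sample frame with a DatetimeIndex; the values are made up.
dates = pd.to_datetime(['2019-10-31', '2019-11-01', '2019-11-29', '2019-12-02'])
sample = pd.DataFrame({'Close': [1.0, 2.0, 3.0, 4.0]}, index=dates)

# Selection with a boolean mask, as in the cell above.
mask_result = sample[(sample.index.month == 11) & (sample.index.year == 2019)]

# The same selection with partial string indexing via .loc.
loc_result = sample.loc['2019-11']

print(mask_result.equals(loc_result))
```

The `.loc['2019-11']` form is shorter, while the mask generalizes more easily to conditions such as "every November, regardless of year".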


In [None]:
df_nov.to_csv('fb_nov.csv')

In [None]:
new_df = pd.read_csv('fb_nov.csv', index_col=0)
new_df

Date,Open,High,Low,Close,Volume
2019-11-29,201.6,203.8,201.21,201.64,7985231
2019-11-27,199.9,203.14,199.42,202.0,12729462
2019-11-26,200.0,200.15,198.039,198.97,11748664
2019-11-25,199.515,200.97,199.25,199.79,15286442
2019-11-22,198.38,199.3,197.62,198.82,9959817
2019-11-21,197.42,199.09,196.86,197.93,12130985
2019-11-20,198.58,199.59,195.43,197.51,12370240
2019-11-19,197.4,200.0,196.86,199.32,19070291
2019-11-18,194.56,198.63,193.05,197.4,16176107
2019-11-15,194.26,195.3,193.38,195.1,11530232
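
One detail of the CSV round trip is worth checking: `read_csv` with only `index_col=0` keeps the dates as plain labels rather than timestamps. The sketch below assumes a tiny hypothetical frame (`sample`, made-up values) and uses an in-memory buffer instead of a file, so that it runs without touching the disk; passing `parse_dates=True` asks pandas to parse the index as dates.

```python
import io

import pandas as pd

# Hypothetical two-row frame with a DatetimeIndex named 'Date'; values are made up.
dates = pd.to_datetime(['2019-11-27', '2019-11-29'])
sample = pd.DataFrame({'Close': [1.0, 2.0]}, index=dates)
sample.index.name = 'Date'

# With no path argument, to_csv returns the CSV content as a string.
csv_text = sample.to_csv()

# Without parse_dates the index holds date strings.
plain = pd.read_csv(io.StringIO(csv_text), index_col=0)

# With parse_dates=True the index column is parsed into a DatetimeIndex.
parsed = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

print(type(plain.index).__name__)
print(type(parsed.index).__name__)
```

With a parsed index, date-based filters such as `parsed.index.month == 11` work on the reloaded frame just as they did on the original one.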


In [None]:
df_nov.to_excel('fb_nov.xlsx')

In [None]:
new_df = pd.read_excel('fb_nov.xlsx', index_col=0)
new_df

Date,Open,High,Low,Close,Volume
2019-11-29,201.6,203.8,201.21,201.64,7985231
2019-11-27,199.9,203.14,199.42,202.0,12729462
2019-11-26,200.0,200.15,198.039,198.97,11748664
2019-11-25,199.515,200.97,199.25,199.79,15286442
2019-11-22,198.38,199.3,197.62,198.82,9959817
2019-11-21,197.42,199.09,196.86,197.93,12130985
2019-11-20,198.58,199.59,195.43,197.51,12370240
2019-11-19,197.4,200.0,196.86,199.32,19070291
2019-11-18,194.56,198.63,193.05,197.4,16176107
2019-11-15,194.26,195.3,193.38,195.1,11530232


### <a name='a4'></a> London Bike Dataset

In [None]:
df = pd.read_csv('london_bike.csv')
df.head()

,timestamp,cnt,t1,t2,hum,wind_speed,weather_code,is_holiday,is_weekend,season
0,2015-01-04 00:00:00,182,3.0,2.0,93.0,6.0,3.0,0.0,1.0,3.0
1,2015-01-04 01:00:00,138,3.0,2.5,93.0,5.0,1.0,0.0,1.0,3.0
2,2015-01-04 02:00:00,134,2.5,2.5,96.5,0.0,1.0,0.0,1.0,3.0
3,2015-01-04 03:00:00,72,2.0,2.0,100.0,0.0,1.0,0.0,1.0,3.0
4,2015-01-04 04:00:00,47,2.0,0.0,93.0,6.5,1.0,0.0,1.0,3.0


In [None]:
df = df.set_index('timestamp')
df.head()

timestamp,cnt,t1,t2,hum,wind_speed,weather_code,is_holiday,is_weekend,season
2015-01-04 00:00:00,182,3.0,2.0,93.0,6.0,3.0,0.0,1.0,3.0
2015-01-04 01:00:00,138,3.0,2.5,93.0,5.0,1.0,0.0,1.0,3.0
2015-01-04 02:00:00,134,2.5,2.5,96.5,0.0,1.0,0.0,1.0,3.0
2015-01-04 03:00:00,72,2.0,2.0,100.0,0.0,1.0,0.0,1.0,3.0
2015-01-04 04:00:00,47,2.0,0.0,93.0,6.5,1.0,0.0,1.0,3.0
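
Setting `timestamp` as the index this way keeps the values as strings, because `read_csv` was called without date parsing. A minimal sketch of one alternative is shown below: it parses the column while reading and sets it as the index in a single call. The in-memory CSV reuses the first two rows shown above, with only three columns kept for brevity; the name `bikes` is introduced here for illustration.

```python
import io

import pandas as pd

# First two rows of the dataset as shown above, reduced to three columns.
csv_text = """timestamp,cnt,t1
2015-01-04 00:00:00,182,3.0
2015-01-04 01:00:00,138,3.0
"""

# Parse 'timestamp' as datetime and use it as the index in one step.
bikes = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=['timestamp'],
    index_col='timestamp',
)

print(type(bikes.index).__name__)
print(bikes.index.hour.tolist())
```

For a frame that is already loaded, `df.index = pd.to_datetime(df.index)` after `set_index('timestamp')` should give the same kind of index.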


In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 17414 entries, 0 to 17413
Data columns (total 10 columns):
timestamp       17414 non-null object
cnt             17414 non-null int64
t1              17414 non-null float64
t2              17414 non-null float64
hum             17414 non-null float64
wind_speed      17414 non-null float64
weather_code    17414 non-null float64
is_holiday      17414 non-null float64
is_weekend      17414 non-null float64
season          17414 non-null float64
dtypes: float64(8), int64(1), object(1)
memory usage: 1.3+ MB
