<a href="https://colab.research.google.com/github/MartaSolarz/Data_science_theory/blob/main/02_pandas/03_zapis_odczyt.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

* @author: krakowiakpawel9@gmail.com  
* @site: e-smartdata.org

### Pandas
>Library website: [https://pandas.pydata.org/](https://pandas.pydata.org/)  
>Documentation: [https://pandas.pydata.org/pandas-docs/stable/](https://pandas.pydata.org/pandas-docs/stable/)
>
>The core library for data analysis in Python.
>
>To install pandas, run the command below:
```
pip install pandas
```
### Table of contents:
1. [Importing libraries](#a1)
2. [Loading data](#a2)
3. [Writing/reading data to and from a CSV file](#a3)
4. [London Bike Dataset](#a4)

### <a name='a1'></a> Importing libraries

In [1]:
import numpy as np
import pandas as pd

### <a name='a2'></a> Loading data

In [3]:
def fetch_financial_data(company='AMZN'):
    """
    Fetch stock market quotations for the given ticker from Stooq.
    """
    # Imported here so pandas_datareader is only needed when data is fetched.
    import pandas_datareader.data as web
    return web.DataReader(name=company, data_source='stooq')

df = fetch_financial_data()
df.head()

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2022-08-22 | 135.72 | 136.32 | 132.85 | 133.22 | 50461504 |
| 2022-08-19 | 140.47 | 141.11 | 137.9142 | 138.23 | 47792843 |
| 2022-08-18 | 141.32 | 142.77 | 140.38 | 142.3 | 37458737 |
| 2022-08-17 | 142.69 | 143.38 | 140.78 | 142.1 | 48149778 |
| 2022-08-16 | 143.905 | 146.57 | 142.0 | 144.78 | 59102859 |


### <a name='a3'></a> Writing/reading data to and from a CSV file

In [4]:
df.to_csv('amazon.csv')

In [5]:
df_nov = df[(df.index.month == 11) & (df.index.year == 2019)]
df_nov

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2019-11-29 | 90.889 | 91.2345 | 90.0395 | 90.04 | 38468800 |
| 2019-11-27 | 90.05 | 91.225 | 89.8655 | 90.9255 | 60467100 |
| 2019-11-26 | 88.996 | 89.8515 | 88.9175 | 89.847 | 63808560 |
| 2019-11-25 | 87.6625 | 88.871 | 87.662 | 88.692 | 69789340 |
| 2019-11-22 | 86.951 | 87.3215 | 86.55 | 87.286 | 49581620 |
| 2019-11-21 | 87.15 | 87.3435 | 86.518 | 86.7355 | 53258760 |
| 2019-11-20 | 87.457 | 88.126 | 86.706 | 87.2765 | 55875180 |
| 2019-11-19 | 87.8495 | 88.034 | 87.1515 | 87.6395 | 45490700 |
| 2019-11-18 | 86.915 | 87.685 | 86.1355 | 87.6265 | 56838140 |
| 2019-11-15 | 88.0025 | 88.084 | 86.643 | 86.9745 | 78622820 |
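
The boolean mask in the cell above compares the month and year of the `DatetimeIndex`. As a possible alternative, pandas also supports partial-string indexing on a `DatetimeIndex`. The sketch below compares the two on a handful of rows copied from the outputs above; the names `quotes`, `by_mask` and `by_label` are illustrative only and do not appear in the original notebook.

```python
import pandas as pd

# A few rows copied from the outputs above; only Close is kept for brevity.
quotes = pd.DataFrame(
    {"Close": [133.22, 90.04, 90.9255, 86.9745]},
    index=pd.to_datetime(["2022-08-22", "2019-11-29", "2019-11-27", "2019-11-15"]),
)
quotes.index.name = "Date"

# Boolean mask on the index components, as in the cell above.
by_mask = quotes[(quotes.index.month == 11) & (quotes.index.year == 2019)]

# Partial-string indexing: select all of November 2019 by label.
# Sorting first means the lookup runs on an ascending index.
by_label = quotes.sort_index().loc["2019-11"]

# Both selections contain the same three November rows (ordering aside).
print(by_mask.sort_index().equals(by_label))
```

The mask keeps the original row order, while the label-based version returns rows in ascending date order because of the `sort_index()` call.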


In [6]:
df_nov.to_csv('amazon_nov.csv')

In [7]:
new_df = pd.read_csv('amazon_nov.csv', index_col=0)
new_df

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2019-11-29 | 90.889 | 91.2345 | 90.0395 | 90.04 | 38468800 |
| 2019-11-27 | 90.05 | 91.225 | 89.8655 | 90.9255 | 60467100 |
| 2019-11-26 | 88.996 | 89.8515 | 88.9175 | 89.847 | 63808560 |
| 2019-11-25 | 87.6625 | 88.871 | 87.662 | 88.692 | 69789340 |
| 2019-11-22 | 86.951 | 87.3215 | 86.55 | 87.286 | 49581620 |
| 2019-11-21 | 87.15 | 87.3435 | 86.518 | 86.7355 | 53258760 |
| 2019-11-20 | 87.457 | 88.126 | 86.706 | 87.2765 | 55875180 |
| 2019-11-19 | 87.8495 | 88.034 | 87.1515 | 87.6395 | 45490700 |
| 2019-11-18 | 86.915 | 87.685 | 86.1355 | 87.6265 | 56838140 |
| 2019-11-15 | 88.0025 | 88.084 | 86.643 | 86.9745 | 78622820 |
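
Reading the file back with `index_col=0` restores the `Date` column as the index, but the labels come back as plain strings rather than dates. A minimal sketch of one way to get a `DatetimeIndex` again is to pass `parse_dates=True`. It uses an in-memory buffer instead of `amazon_nov.csv` so that it runs on its own; the names `nov`, `as_text` and `as_dates` are made up for this illustration.

```python
import io

import pandas as pd

# Two rows copied from the November output above (Close and Volume only).
nov = pd.DataFrame(
    {"Close": [90.04, 90.9255], "Volume": [38468800, 60467100]},
    index=pd.DatetimeIndex(["2019-11-29", "2019-11-27"], name="Date"),
)

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
nov.to_csv(buffer)

# index_col=0 alone restores the labels but leaves them as strings.
buffer.seek(0)
as_text = pd.read_csv(buffer, index_col=0)

# parse_dates=True asks pandas to parse the index as dates.
buffer.seek(0)
as_dates = pd.read_csv(buffer, index_col=0, parse_dates=True)

print(type(as_text.index).__name__)
print(type(as_dates.index).__name__)
```

With the parsed index, date-based filters such as `as_dates.index.month == 11` work again after the round trip.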


In [8]:
df_nov.to_excel('amazon_nov.xlsx')

In [9]:
new_df = pd.read_excel('amazon_nov.xlsx', index_col=0)
new_df

| Date | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2019-11-29 | 90.889 | 91.2345 | 90.0395 | 90.04 | 38468800 |
| 2019-11-27 | 90.05 | 91.225 | 89.8655 | 90.9255 | 60467100 |
| 2019-11-26 | 88.996 | 89.8515 | 88.9175 | 89.847 | 63808560 |
| 2019-11-25 | 87.6625 | 88.871 | 87.662 | 88.692 | 69789340 |
| 2019-11-22 | 86.951 | 87.3215 | 86.55 | 87.286 | 49581620 |
| 2019-11-21 | 87.15 | 87.3435 | 86.518 | 86.7355 | 53258760 |
| 2019-11-20 | 87.457 | 88.126 | 86.706 | 87.2765 | 55875180 |
| 2019-11-19 | 87.8495 | 88.034 | 87.1515 | 87.6395 | 45490700 |
| 2019-11-18 | 86.915 | 87.685 | 86.1355 | 87.6265 | 56838140 |
| 2019-11-15 | 88.0025 | 88.084 | 86.643 | 86.9745 | 78622820 |


### <a name='a4'></a> London Bike Dataset

In [10]:
df = pd.read_csv('london_bike.csv')
df.head()

|  | timestamp | cnt | t1 | t2 | hum | wind_speed | weather_code | is_holiday | is_weekend | season |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-04 00:00:00 | 182 | 3.0 | 2.0 | 93.0 | 6.0 | 3.0 | 0.0 | 1.0 | 3.0 |
| 1 | 2015-01-04 01:00:00 | 138 | 3.0 | 2.5 | 93.0 | 5.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 2 | 2015-01-04 02:00:00 | 134 | 2.5 | 2.5 | 96.5 | 0.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 3 | 2015-01-04 03:00:00 | 72 | 2.0 | 2.0 | 100.0 | 0.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 4 | 2015-01-04 04:00:00 | 47 | 2.0 | 0.0 | 93.0 | 6.5 | 1.0 | 0.0 | 1.0 | 3.0 |


In [11]:
df = df.set_index('timestamp')
df.head()

| timestamp | cnt | t1 | t2 | hum | wind_speed | weather_code | is_holiday | is_weekend | season |
|---|---|---|---|---|---|---|---|---|---|
| 2015-01-04 00:00:00 | 182 | 3.0 | 2.0 | 93.0 | 6.0 | 3.0 | 0.0 | 1.0 | 3.0 |
| 2015-01-04 01:00:00 | 138 | 3.0 | 2.5 | 93.0 | 5.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 2015-01-04 02:00:00 | 134 | 2.5 | 2.5 | 96.5 | 0.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 2015-01-04 03:00:00 | 72 | 2.0 | 2.0 | 100.0 | 0.0 | 1.0 | 0.0 | 1.0 | 3.0 |
| 2015-01-04 04:00:00 | 47 | 2.0 | 0.0 | 93.0 | 6.5 | 1.0 | 0.0 | 1.0 | 3.0 |


In [12]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 17414 entries, 2015-01-04 00:00:00 to 2017-01-03 23:00:00
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   cnt           17414 non-null  int64  
 1   t1            17414 non-null  float64
 2   t2            17414 non-null  float64
 3   hum           17414 non-null  float64
 4   wind_speed    17414 non-null  float64
 5   weather_code  17414 non-null  float64
 6   is_holiday    17414 non-null  float64
 7   is_weekend    17414 non-null  float64
 8   season        17414 non-null  float64
dtypes: float64(8), int64(1)
memory usage: 1.3+ MB
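
The `df.info()` output above reports a plain `Index` rather than a `DatetimeIndex`, so the timestamps are still stored as strings, and flag-like columns such as `is_weekend` are stored as `float64`. The sketch below shows one possible cleanup on three rows copied from the `df.head()` output. The name `bikes` and the choice of columns are illustrative, and the boolean cast assumes that `is_weekend` only ever holds 0.0 or 1.0, which the sample rows suggest but do not prove.

```python
import pandas as pd

# Three rows copied from the df.head() output above, with a subset of columns.
bikes = pd.DataFrame(
    {
        "timestamp": [
            "2015-01-04 00:00:00",
            "2015-01-04 01:00:00",
            "2015-01-04 02:00:00",
        ],
        "cnt": [182, 138, 134],
        "is_weekend": [1.0, 1.0, 1.0],
    }
).set_index("timestamp")

# Convert the string labels into a DatetimeIndex.
bikes.index = pd.to_datetime(bikes.index)

# Assumption: is_weekend contains only 0.0 or 1.0, so a boolean cast is lossless.
bikes["is_weekend"] = bikes["is_weekend"].astype(bool)

# Datetime attributes such as .hour are now available on the index.
hourly_mean = bikes.groupby(bikes.index.hour)["cnt"].mean()
print(hourly_mean)
```

An alternative would be to pass `index_col` and `parse_dates` directly to `pd.read_csv`, so that the index is parsed when the file is loaded.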
