# Pandas Tutorial 1 - Series

## what is pandas?

```
source : https://pandas.pydata.org/docs/getting_started/overview.html

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis/manipulation tool available in any language. It is already well on its way toward this goal.

```

# 1. loading modules

In [78]:
import pandas as pd #pd is an alias for pandas
import numpy as np #np is an alias for numpy
import matplotlib.pyplot as plt #plt is an alias for pyplot

#alias : pd is the community-agreed alias for pandas, and the pandas documentation assumes it throughout.

If pandas is not installed yet, run the line below first, then re-run the imports above.

!pip install pandas

source : https://pandas.pydata.org/docs/getting_started/overview.html

The two primary data structures of pandas are:
1. Series : 1D labeled array
2. DataFrame : general 2D labeled structure

![image](./images/SeriesandDataframe.png)
source : https://www.learndatasci.com/tutorials/python-pandas-tutorial-complete-introduction-for-beginners/
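
To make the relationship between the two structures concrete, here is a minimal sketch. The fruit names, prices, and the variable names `prices`, `fruits`, and `price_column` are made up for illustration and do not come from the pandas documentation.

```python
import pandas as pd

# A Series: one column of values, each with its own label
prices = pd.Series([100, 200, 300], index=['apple', 'banana', 'orange'])

# A DataFrame: several columns that share the same row labels
fruits = pd.DataFrame(
    {'price': [100, 200, 300], 'stock': [5, 0, 12]},
    index=['apple', 'banana', 'orange'],
)

# Selecting a single column of a DataFrame gives back a Series
price_column = fruits['price']

print(prices.ndim, fruits.ndim)   # 1 2
print(type(price_column).__name__)  # Series
```

The rest of this tutorial covers only the Series.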

# 2. creating Series

You can construct a Series from any of the following:
1. list
2. dictionary
3. numpy ndarray

source : https://wikidocs.net/32829     
https://doorbw.tistory.com/172 (upper part)          
https://www.tutorialspoint.com/python_pandas/python_pandas_series.htm   
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html (lower part)  

### 2-1. using list

In [21]:
sr1 = pd.Series([1, 2, 3, 4])
sr1

0    1
1    2
2    3
3    4
dtype: int64

In [26]:
values = [100, 200, 300, 400, 500]
index = ['apple', 'banana', 'watermelon', 'grapes', 'orange']
sr2 = pd.Series(values, index=index)
sr2

apple         100
banana        200
watermelon    300
grapes        400
orange        500
dtype: int64

### 2-2. using dictionary

In [27]:
sdata = {'Kim' : 100, 'Park' : 200, 'Lee' : 300, 'Choi' : 400}
sr3 = pd.Series(sdata)
sr3

Kim     100
Park    200
Lee     300
Choi    400
dtype: int64
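
The third option listed at the start of this section, a numpy ndarray, works much like a list. The sketch below uses made-up float values and the hypothetical names `arr`, `sr_np`, and `sr_np_labeled`.

```python
import numpy as np
import pandas as pd

arr = np.array([1.5, 2.5, 3.5])

# Without an index, pandas assigns the default labels 0, 1, 2, ...
sr_np = pd.Series(arr)

# An explicit index must have the same length as the array
sr_np_labeled = pd.Series(arr, index=['a', 'b', 'c'])

print(sr_np_labeled)
print(sr_np_labeled['b'])   # 2.5
```

The Series takes its dtype from the array, so a float array gives a `float64` Series.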

### 2-3. QUIZ

* Create the Series shown below from a dictionary. Name it sr4 and print the result.

![image](./images/quiz1-4.png)

### Answer

In [45]:
sdata = {1 : 'Thomas', 2 : 'Jane', 3 : 'Edward', 4 : 'Jessica', 5 : 'Irene'}
sr4 = pd.Series(sdata)
sr4 #display the result

1     Thomas
2       Jane
3     Edward
4    Jessica
5      Irene
dtype: object

# 3. values and indices

### 3-1. values of series

In [38]:
sdata = {'Kim' : 100, 'Park' : 200, 'Lee' : 300, 'Choi' : 400}
sr3 = pd.Series(sdata)
sr3 #series from above

Kim     100
Park    200
Lee     300
Choi    400
dtype: int64

In [39]:
sr3.values

array([100, 200, 300, 400])

### 3-2. indices of series

In [40]:
sr3.index

Index(['Kim', 'Park', 'Lee', 'Choi'], dtype='object')
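
One consequence of keeping values and index separate is easy to miss: the `in` operator on a Series checks the index labels, not the values. A small sketch, using the same data as above under the hypothetical name `scores`:

```python
import pandas as pd

scores = pd.Series({'Kim': 100, 'Park': 200, 'Lee': 300, 'Choi': 400})

label_found = 'Kim' in scores          # looks in the index
value_as_label = 100 in scores         # 100 is a value, not a label
value_found = 100 in scores.values     # looks in the underlying array

print(label_found)      # True
print(value_as_label)   # False
print(value_found)      # True
print(list(scores.index))    # ['Kim', 'Park', 'Lee', 'Choi']
print(scores.values.sum())   # 1000
```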

### 3-3. changing indices

In [41]:
sr3.index = ['Seoul', 'Pusan', 'Daegu', 'Daejeon']
sr3

Seoul      100
Pusan      200
Daegu      300
Daejeon    400
dtype: int64
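
Assigning to `.index` replaces every label at once and changes the Series in place. To change only some labels, or to keep the original untouched, `Series.rename` with a mapping is an alternative. The sketch below uses the hypothetical names `scores` and `renamed`.

```python
import pandas as pd

scores = pd.Series({'Kim': 100, 'Park': 200, 'Lee': 300, 'Choi': 400})

# rename() returns a new Series; labels missing from the mapping are kept
renamed = scores.rename({'Kim': 'Seoul', 'Park': 'Pusan'})
print(list(renamed.index))   # ['Seoul', 'Pusan', 'Lee', 'Choi']
print(list(scores.index))    # ['Kim', 'Park', 'Lee', 'Choi']

# Assigning a list of the wrong length to .index raises ValueError
try:
    scores.index = ['Seoul', 'Pusan']
except ValueError as err:
    print('ValueError:', err)
```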

### 3-4. QUIZ

* Change the index labels of sr3 to green, yellow, blue, red and display the result.

### Answer

In [44]:
sr3.index = ['green', 'yellow', 'blue', 'red']
sr3

green     100
yellow    200
blue      300
red       400
dtype: int64

# 4. access data from Series
source : https://www.tutorialspoint.com/python_pandas/python_pandas_series.htm

### 4-1. single element

In [61]:
sr4 = pd.Series([100, 200, 300, 400], index=['a', 'b', 'c', 'd']) #new Series; replaces the sr4 from the quiz in 2-3
sr4

a    100
b    200
c    300
d    400
dtype: int64

In [62]:
sr4[0] #element at integer position 0 (newer pandas versions recommend sr4.iloc[0])

100

In [63]:
sr4[3]

400

In [64]:
sr4[6] #error when the index is out of range

IndexError: index 6 is out of bounds for axis 0 with size 4

In [65]:
sr4[-1] #-1 : the last element

400
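
The cells above use a plain integer inside `[]` on a Series whose labels are strings. As far as I know, recent pandas releases warn about this form and plan to treat such integers as labels, so `.iloc` (by position) and `.loc` (by label) are the unambiguous choices. A minimal sketch with the same data, under the hypothetical name `sr`:

```python
import pandas as pd

sr = pd.Series([100, 200, 300, 400], index=['a', 'b', 'c', 'd'])

first = sr.iloc[0]     # by position
last = sr.iloc[-1]     # negative positions count from the end
by_label = sr.loc['a'] # by label

print(first, last, by_label)   # 100 400 100

# An out-of-range position still raises IndexError
try:
    sr.iloc[6]
except IndexError as err:
    print('IndexError:', err)
```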

### 4-2. multiple elements

In [66]:
sr4[:2]

a    100
b    200
dtype: int64

In [67]:
sr4[:3]

a    100
b    200
c    300
dtype: int64

In [68]:
sr4[:]

a    100
b    200
c    300
d    400
dtype: int64

### 4-3. retrieving data by label, like a dictionary

In [72]:
sr4['a']

100

In [76]:
sr4['a':'c'] #label slices include the end label 'c'

a    100
b    200
c    300
dtype: int64

In [77]:
sr4[:'d']

a    100
b    200
c    300
d    400
dtype: int64
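
The two kinds of slice treat their end point differently: a position slice stops before the end position, as with Python lists, while a label slice includes the end label. The sketch below puts them side by side, using `.iloc` and `.loc` to make the intent explicit. The names `sr`, `by_position`, and `by_label` are illustrative.

```python
import pandas as pd

sr = pd.Series([100, 200, 300, 400], index=['a', 'b', 'c', 'd'])

by_position = sr.iloc[0:2]   # position 2 is excluded
by_label = sr.loc['a':'c']   # label 'c' is included

print(list(by_position.index))   # ['a', 'b']
print(list(by_label.index))      # ['a', 'b', 'c']
```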

### 4-4. QUIZ

* Retrieve the last three elements of the Series below (sr5) using a slice.   
![image](./images/quiz3-3.png)

### Answer

In [71]:
sr5[-3:]

D    586
E    168
F    946
dtype: int64