# [10 Minutes to pandas — pandas 0.23.3 documentation](https://pandas.pydata.org/pandas-docs/stable/10min.html)

# Object Creation
# Viewing Data
Group 1: from the beginning through No. 22

# Selection
## Getting
## Selection by Label
## Selection by Position
## Boolean Indexing
## Setting
Group 2: No. 23 through No. 54

# Missing Data
# Operations
## Stats
## Apply
## Histogramming
## String Methods
Group 3: No. 55 through No. 72

# Merge
## Concat
## Join
## Append
Group 4: No. 73 through No. 90

# Grouping

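`Grouping` ("group by") refers to a process involving one or more of the following steps:

- Splitting the data into groups based on some criteria
- Applying a function to each group independently
- Combining the results into a data structure

See the Grouping section.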
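
The three steps can be sketched by hand and compared with a single `groupby` call. This is only an illustration: the `sales` frame and its `team` and `points` columns are hypothetical names, not data from this tutorial.

```python
import pandas as pd

# Hypothetical toy data, chosen so the sums are easy to check by hand.
sales = pd.DataFrame({
    "team": ["a", "b", "a", "b", "a"],
    "points": [1, 2, 3, 4, 5],
})

# Split: iterating over a groupby yields one (key, sub-frame) pair per group.
# Apply: sum the points of each sub-frame independently.
manual = {key: int(part["points"].sum()) for key, part in sales.groupby("team")}

# Combine: assemble the per-group results into a single Series.
combined = pd.Series(manual, name="points")

# groupby(...).sum() performs all three steps in one call.
direct = sales.groupby("team")["points"].sum()
```

Both `combined` and `direct` hold one total per team, which suggests that the one-line form is usually the more convenient way to write the same process.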

# Reshaping

In [1]:
import pandas as pd

In [2]:
import numpy as np

In [3]:
import matplotlib.pyplot as plt

See the sections on Hierarchical Indexing and Reshaping.

## Stack

In [4]:
tuples = list(zip(*[['bar', 'bar', 'baz', 'baz',
                     'foo', 'foo', 'qux', 'qux'],
                    ['one', 'two', 'one', 'two',
                     'one', 'two', 'one', 'two']]))

In [5]:
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])

In [6]:
df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])

In [7]:
df2 = df[:4]

In [8]:
df2

                     A         B
first second
bar   one     1.321056 -1.809664
      two     1.874903 -0.415736
baz   one    -0.808714  0.573662
      two     0.549350 -0.457618


The stack() method “compresses” a level in the DataFrame’s columns.
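
To make the "compression" concrete, here is a small sketch on a fixed-value frame. The `wide` frame is a hypothetical example, not the random data used in this section. It shows the column labels becoming a new innermost index level.

```python
import pandas as pd

# Hypothetical 2x2 frame with fixed values.
wide = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# stack() moves the column labels into a new innermost index level,
# turning the 2x2 DataFrame into a Series with 4 entries.
tall = wide.stack()

print(tall.index.nlevels)      # 2
print(tall.loc[("x", "B")])    # 3
```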

In [9]:
stacked = df2.stack()

In [10]:
stacked

first  second   
bar    one     A    1.321056
               B   -1.809664
       two     A    1.874903
               B   -0.415736
baz    one     A   -0.808714
               B    0.573662
       two     A    0.549350
               B   -0.457618
dtype: float64

With a “stacked” DataFrame or Series (having a MultiIndex as the index), the inverse operation of stack() is unstack(), which by default unstacks the last level:

In [11]:
stacked.unstack()

                     A         B
first second
bar   one     1.321056 -1.809664
      two     1.874903 -0.415736
baz   one    -0.808714  0.573662
      two     0.549350 -0.457618


In [12]:
stacked.unstack(1)

second        one       two
first
bar   A  1.321056  1.874903
      B -1.809664 -0.415736
baz   A -0.808714  0.549350
      B  0.573662 -0.457618


In [13]:
stacked.unstack(0)

first          bar       baz
second
one    A  1.321056 -0.808714
       B -1.809664  0.573662
two    A  1.874903  0.549350
       B -0.415736 -0.457618
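
The level passed to `unstack()` can also be given by name rather than by position. The sketch below uses a hypothetical fixed-value Series `s` with the same level names as above. It checks that the default call matches unstacking the last level, and that `stack()` reverses it.

```python
import pandas as pd

# Hypothetical fixed-value Series with a two-level index named like the one above.
index = pd.MultiIndex.from_product([["bar", "baz"], ["one", "two"]],
                                   names=["first", "second"])
s = pd.Series([1, 2, 3, 4], index=index)

by_default = s.unstack()         # moves the last level ("second") into the columns
by_name = s.unstack("second")    # same result, with the level chosen by name
by_first = s.unstack("first")    # moves the outer level instead

# Stacking the default result gives back the original values in order.
round_trip = by_default.stack()
```

Referring to levels by name tends to be easier to read than positional numbers once an index has more than two levels.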


## Pivot Tables

See the section on Pivot Tables.

In [14]:
df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 3,
                   'B' : ['A', 'B', 'C'] * 4,
                   'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
                   'D' : np.random.randn(12),
                   'E' : np.random.randn(12)})

In [15]:
df

       A  B    C         D         E
0    one  A  foo -0.701943 -0.656419
1    one  B  foo -0.471567  0.240199
2    two  C  foo -0.128653 -0.334675
3  three  A  bar -0.964232 -2.250993
4    one  B  bar -0.088475 -1.929603
5    one  C  bar  1.231824 -0.793334
6    two  A  foo -0.901103 -0.961320
7  three  B  foo  0.998334 -1.939777
8    one  C  foo  0.533768 -0.296282
9    one  A  bar -0.275955  1.389043


We can produce pivot tables from this data very easily:

In [16]:
pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])

C             bar       foo
A     B
one   A -0.275955 -0.701943
      B -0.088475 -0.471567
      C  1.231824  0.533768
three A -0.964232       NaN
      B       NaN  0.998334
      C  0.111105       NaN
two   A       NaN -0.901103
      B -0.173966       NaN
      C       NaN -0.128653
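
Two details of the pivot table above are easy to miss: index/column combinations that never occur in the data are left missing, and repeated combinations are aggregated (by the mean, unless another `aggfunc` is given). A minimal sketch with a hypothetical fixed-value frame `records`:

```python
import pandas as pd

# Hypothetical fixed-value data: the ("one", "foo") pair occurs twice,
# and the ("two", "foo") pair does not occur at all.
records = pd.DataFrame({
    "A": ["one", "one", "one", "two"],
    "C": ["foo", "foo", "bar", "bar"],
    "D": [1.0, 3.0, 5.0, 7.0],
})

table = pd.pivot_table(records, values="D", index="A", columns="C")

# Duplicate (A, C) pairs are aggregated with the default aggfunc, the mean.
print(table.loc["one", "foo"])            # 2.0
# Combinations that never occur in the data show up as NaN.
print(pd.isna(table.loc["two", "foo"]))   # True
```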


In [3]:
# The imports below may be removed later
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [4]:
df = pd.DataFrame(
    {
        'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
        'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
        'C' : np.random.randn(8),
        'D' : np.random.randn(8)
    })

In [5]:
df

     A      B         C         D
0  foo    one  1.242203  0.585931
1  bar    one -0.803583 -0.279826
2  foo    two  0.551023  1.391615
3  bar  three -0.066917  0.360173
4  foo    two -0.783369  0.501214
5  bar    two -0.014651  0.675781
6  foo    one  1.375350  0.884490
7  foo  three  1.114174  1.830266


Group by 'A', then apply the sum() function to the resulting groups.

In [6]:
df.groupby('A').sum()

            C         D
A
bar -0.885152  0.756129
foo  3.499380  5.193516
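
The result above contains only the numeric columns 'C' and 'D'. How the string column 'B' is treated by `sum()` can depend on the pandas version, so one defensive option is to select the columns to aggregate explicitly. A sketch with a hypothetical fixed-value frame `frame` that mirrors the column layout above:

```python
import pandas as pd

# Hypothetical fixed-value frame with the same column layout as the one above.
frame = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar"],
    "B": ["one", "one", "two", "three"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})

# Selecting the numeric columns first keeps the string column "B"
# out of the aggregation, whatever the version's default behaviour is.
totals = frame.groupby("A")[["C", "D"]].sum()
```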


Grouping by multiple columns forms a hierarchical index, and the sum function can be applied here as well.

In [7]:
df.groupby(['A', 'B']).sum()

                  C         D
A   B
bar one   -0.803583 -0.279826
    three -0.066917  0.360173
    two   -0.014651  0.675781
foo one    2.617553  1.470421
    three  1.114174  1.830266
    two   -0.232346  1.892828



# Time Series
Group 6: No. 108 through No. 126

# Categoricals
# Plotting
Group 7: No. 127 through No. 140

# Getting Data In/Out

## CSV

## HDF5

## Excel

# Gotchas

If you attempt to perform an operation, you might see an exception like the following:


In [2]:
if pd.Series([False, True, False]):
    print("I was true")

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

See [Comparisons](https://pandas.pydata.org/pandas-docs/stable/basics.html#basics-compare) for an explanation and what to do.
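
Testing a Series directly in an `if` statement is ambiguous, because pandas cannot tell whether "any element is true", "all elements are true", or "the Series is non-empty" is meant. The sketch below catches the resulting `ValueError` and shows three explicit alternatives. The variable names are illustrative only.

```python
import pandas as pd

flags = pd.Series([False, True, False])

# Using the Series itself as a condition raises ValueError.
try:
    if flags:
        pass
    raised = False
except ValueError:
    raised = True

# State the intended question explicitly instead.
has_any_true = bool(flags.any())   # at least one element is True
all_true = bool(flags.all())       # every element is True
is_empty = flags.empty             # the Series has no elements
```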

See [Gotchas](https://pandas.pydata.org/pandas-docs/stable/gotchas.html#gotchas) as well.

Group 8: No. 141 through the end
