# 2. Wrangling Data in Python with pandas
**Let's work with DataFrames using pandas.**

1. Getting started with pandas
    - Prerequisite: Table
    - Importing pandas
   
2. Handling 1-D data with pandas: Series
    - Creating a Series
    - Series vs ndarray
    - Series vs dict
    - Naming a Series
3. Handling 2-D data with pandas: DataFrame
    - Creating a DataFrame
    - From CSV to DataFrame
    - Accessing data in a DataFrame

[COVID data used in this lesson](https://www.kaggle.com/imdevskp/corona-virus-report)

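## I. Getting started with pandas

### Prerequisite: Table  

- A data structure (container) that stores and manages data in rows and columns
- Typically each row represents an entity and each column an attribute

### Getting started with pandas

Load the library with `import pandas`.  
As with `numpy`, the conventional alias is `import pandas as pd`.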

In [5]:
import pandas as pd

## II. Handling 1-D data with pandas: Series

### What is a Series?

- 1-D labeled **array**
- Like a dictionary, it lets you specify the index (labels)

In [8]:
s = pd.Series([1,4,9,16,25])

s

0     1
1     4
2     9
3    16
4    25
dtype: int64

In [10]:
t = pd.Series({'one':1, 'two':2, 'three':3, 'four':4, 'five':5})

t

one      1
two      2
three    3
four     4
five     5
dtype: int64
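
Besides a list or a dict, labels can also be passed explicitly. The following is a minimal sketch with made-up values (`squares` and its labels are illustrative, not part of the lesson data) that uses the `index` argument of `pd.Series`.

```python
import pandas as pd

# Pass the labels explicitly through the `index` argument (example values are made up).
squares = pd.Series([1, 4, 9], index=["a", "b", "c"])

print(squares["b"])         # look up a value by its label
print(list(squares.index))  # the labels themselves
print(squares.to_dict())    # convert back to a plain dict
```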

### Series + NumPy

- A Series behaves much like an ndarray: positional indexing, slicing, boolean masks, and NumPy functions all work on it.

In [11]:
s[1]

4

In [13]:
t.iloc[1]  # access by position; plain t[1] on a string-labeled Series is deprecated in recent pandas

2

In [14]:
t[1:3]

two      2
three    3
dtype: int64

In [16]:
s[s > s.median()] # keep only the values greater than the Series' own median

3    16
4    25
dtype: int64

In [17]:
s[[3, 1, 4]]

3    16
1     4
4    25
dtype: int64

In [18]:
import numpy as np

np.exp(s)

0    2.718282e+00
1    5.459815e+01
2    8.103084e+03
3    8.886111e+06
4    7.200490e+10
dtype: float64

In [19]:
s.dtype

dtype('int64')
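
One place where a Series differs from an ndarray is arithmetic between two objects: a Series matches values by label rather than by position. The sketch below uses two small made-up Series (`a` and `b` are illustrative names) to show the difference.

```python
import numpy as np
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# ndarray arithmetic is purely positional: first with first, second with second, ...
positional = a.to_numpy() + b.to_numpy()

# Series arithmetic lines up equal labels first;
# labels that exist on only one side produce NaN.
aligned = a + b

print(positional)
print(aligned)
```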

### Series + dict

- A Series also behaves much like a **dict**: values can be read, added, and tested for membership by key.

In [20]:
t

one      1
two      2
three    3
four     4
five     5
dtype: int64

In [21]:
t['one']

1

In [22]:
# Add a value to the Series

t['six'] = 6

t

one      1
two      2
three    3
four     4
five     5
six      6
dtype: int64

In [23]:
'six' in t

True

In [24]:
'seven' in t

False

In [27]:
# Accessing a key that does not exist in the Series raises an error (KeyError)
# t['seven']

In [28]:
t.get('seven')

In [29]:
t.get('seven', 0)

0
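
The two ways of handling a missing key can be compared side by side. This is a small sketch on a made-up Series; the variable names `value` and `fallback` are only for illustration.

```python
import pandas as pd

t = pd.Series({"one": 1, "two": 2})

# Bracket access raises KeyError for a missing label, just like a dict.
try:
    value = t["seven"]
except KeyError:
    value = None

# .get returns a default instead of raising (None when no default is given).
fallback = t.get("seven", 0)

print(value, fallback)
```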

### Naming a Series

- A Series has a `name` attribute.
- The name can be set when the Series is first created.

In [33]:
s = pd.Series(np.random.randn(5), name="random_nums") # draw 5 random numbers from the standard normal distribution

s

0   -0.671174
1    2.403873
2   -0.994518
3   -1.728656
4    0.911089
Name: random_nums, dtype: float64

In [34]:
s.name = "임의의 난수"

s

0   -0.671174
1    2.403873
2   -0.994518
3   -1.728656
4    0.911089
Name: 임의의 난수, dtype: float64
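
One reason the name matters: it becomes the column label when the Series is turned into a table. A minimal sketch, assuming a made-up `heights` Series:

```python
import pandas as pd

heights = pd.Series([150, 160, 170], name="height")

# The Series name is used as the column name of the resulting one-column DataFrame.
frame = heights.to_frame()
print(list(frame.columns))

# rename() with a scalar returns a new Series with a different name;
# the original Series keeps its name.
renamed = heights.rename("height_cm")
print(renamed.name, heights.name)
```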

## III. Handling 2-D data with pandas: DataFrame

### What is a DataFrame?

- 2-D labeled **table**  (labeled: structured as keys and values)
- An index can also be specified
  
(Unlike 1-D data, 2-D data has the shape of a table, so a dictionary is a more natural way to express it than a list.)

In [35]:
d = {"height":[1, 2, 3, 4], "weight":[30, 40, 50, 60]}

df = pd.DataFrame(d)

df

| | height | weight |
|---|---|---|
| 0 | 1 | 30 |
| 1 | 2 | 40 |
| 2 | 3 | 50 |
| 3 | 4 | 60 |


### Checking dtypes

A NumPy array holds a single data type, which can be read from `ndarray.dtype`.  
In pandas each column can have a different data type, so use `dtypes` to see the type of every column.

In [37]:
df.dtypes

height    int64
weight    int64
dtype: object
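
To see per-column types differ, and to see the optional index in action, here is a small sketch with made-up data (`people` and its labels are hypothetical). The exact dtype reported for the text column can depend on the pandas version, so it is only printed, not relied on.

```python
import pandas as pd

# Three columns with different kinds of values, plus an explicit row index.
people = pd.DataFrame(
    {
        "height": [1.5, 1.6, 1.7],   # floats
        "weight": [50, 60, 70],      # integers
        "name": ["a", "b", "c"],     # text
    },
    index=["p1", "p2", "p3"],
)

print(people.dtypes)       # one dtype per column
print(list(people.index))  # the row labels passed via `index`
```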

### From CSV to DataFrame

- A Comma Separated Values (CSV) file can be loaded directly into a DataFrame.
- Use `pd.read_csv()`
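
If no CSV file is at hand, `read_csv` can be tried on in-memory text. The sketch below wraps a made-up CSV string in `io.StringIO`; the column names and values are illustrative only.

```python
import io
import pandas as pd

# A tiny CSV document held in a string (made-up data).
csv_text = """country,confirmed,region
A,10,Europe
B,25,Africa
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (rows, columns)
print(list(df.columns))  # the header row becomes the column names
```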

In [39]:
# Assumes country_wise_latest.csv is in the same directory as this notebook

covid = pd.read_csv("./country_wise_latest.csv")

covid

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
0,Afghanistan,36263,1269,25198,9796,106,10,18,3.50,69.49,5.04,35526,737,2.07,Eastern Mediterranean
1,Albania,4880,144,2745,1991,117,6,63,2.95,56.25,5.25,4171,709,17.00,Europe
2,Algeria,27973,1163,18837,7973,616,8,749,4.16,67.34,6.17,23691,4282,18.07,Africa
3,Andorra,907,52,803,52,10,0,0,5.73,88.53,6.48,884,23,2.60,Europe
4,Angola,950,41,242,667,18,1,0,4.32,25.47,16.94,749,201,26.84,Africa
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
182,West Bank and Gaza,10621,78,3752,6791,152,2,0,0.73,35.33,2.08,8916,1705,19.12,Eastern Mediterranean
183,Western Sahara,10,1,8,1,0,0,0,10.00,80.00,12.50,10,0,0.00,Africa
184,Yemen,1691,483,833,375,10,4,36,28.56,49.26,57.98,1619,72,4.45,Eastern Mediterranean
185,Zambia,4552,140,2815,1597,71,1,465,3.08,61.84,4.97,3326,1226,36.86,Africa


### Using pandas 1. Viewing part of the data

`head(n)`: returns the first n rows

In [41]:
# Look at the first 5 rows from the top

covid.head(5)

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
0,Afghanistan,36263,1269,25198,9796,106,10,18,3.5,69.49,5.04,35526,737,2.07,Eastern Mediterranean
1,Albania,4880,144,2745,1991,117,6,63,2.95,56.25,5.25,4171,709,17.0,Europe
2,Algeria,27973,1163,18837,7973,616,8,749,4.16,67.34,6.17,23691,4282,18.07,Africa
3,Andorra,907,52,803,52,10,0,0,5.73,88.53,6.48,884,23,2.6,Europe
4,Angola,950,41,242,667,18,1,0,4.32,25.47,16.94,749,201,26.84,Africa


`tail(n)`: returns the last n rows

In [42]:
# Look at the last 5 rows from the bottom

covid.tail(5)

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
182,West Bank and Gaza,10621,78,3752,6791,152,2,0,0.73,35.33,2.08,8916,1705,19.12,Eastern Mediterranean
183,Western Sahara,10,1,8,1,0,0,0,10.0,80.0,12.5,10,0,0.0,Africa
184,Yemen,1691,483,833,375,10,4,36,28.56,49.26,57.98,1619,72,4.45,Eastern Mediterranean
185,Zambia,4552,140,2815,1597,71,1,465,3.08,61.84,4.97,3326,1226,36.86,Africa
186,Zimbabwe,2704,36,542,2126,192,2,24,1.33,20.04,6.64,1713,991,57.85,Africa
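
Both methods can be checked quickly on a small frame. This sketch uses a made-up ten-row DataFrame; note that `n` defaults to 5 when omitted.

```python
import pandas as pd

df = pd.DataFrame({"n": range(10)})

first3 = df.head(3)   # rows at positions 0, 1, 2
last2 = df.tail(2)    # rows at positions 8, 9
default = df.head()   # n defaults to 5

print(first3)
print(last2)
print(len(default))
```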


### Using pandas 2. Accessing data

- `df['column name']` or `df.column_name`

In [47]:
covid['WHO Region']

0      Eastern Mediterranean
1                     Europe
2                     Africa
3                     Europe
4                     Africa
               ...          
182    Eastern Mediterranean
183                   Africa
184    Eastern Mediterranean
185                   Africa
186                   Africa
Name: WHO Region, Length: 187, dtype: object

In [46]:
covid.Active # attribute access cannot be used when the column name contains a space; use the bracket form above instead

0      9796
1      1991
2      7973
3        52
4       667
       ... 
182    6791
183       1
184     375
185    1597
186    2126
Name: Active, Length: 187, dtype: int64

In [50]:
covid['Confirmed']

0      36263
1       4880
2      27973
3        907
4        950
       ...  
182    10621
183       10
184     1691
185     4552
186     2704
Name: Confirmed, Length: 187, dtype: int64
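
The two access styles, plus selecting several columns at once, can be summarized in a short sketch. The frame below is made up and only reuses two column names from the covid data.

```python
import pandas as pd

df = pd.DataFrame({"Active": [5, 7], "New cases": [1, 0]})

active = df.Active            # attribute style: fine for a valid Python identifier
new_cases = df["New cases"]   # bracket style: required here because of the space

# Passing a list of names returns a DataFrame rather than a Series.
subset = df[["Active", "New cases"]]

print(type(active).__name__, type(subset).__name__)
```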

### Honey Tip!  Each column of a DataFrame is a "Series"!

Each column of a DataFrame is a Series. In other words, a DataFrame is a collection of Series placed side by side.

In [49]:
type(covid['Confirmed'])

pandas.core.series.Series

In [51]:
covid['Confirmed'][0]

36263

In [52]:
covid['Confirmed'][1:5]

1     4880
2    27973
3      907
4      950
Name: Confirmed, dtype: int64
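
The relationship also works in the other direction: Series that share an index can be assembled into a DataFrame. A minimal sketch with made-up values, using `pd.concat` along the column axis:

```python
import pandas as pd

confirmed = pd.Series([10, 20], index=["A", "B"], name="Confirmed")
deaths = pd.Series([1, 2], index=["A", "B"], name="Deaths")

# Place the Series side by side; each Series name becomes a column name.
df = pd.concat([confirmed, deaths], axis=1)

# Pulling a column back out gives a Series again.
col = df["Confirmed"]
print(type(col).__name__)
print(list(df.columns))
```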

### Using pandas 3. Accessing data with a condition

In [56]:
# Find the countries with more than 100 new cases

covid[covid['New cases'] > 100].head(5)

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
0,Afghanistan,36263,1269,25198,9796,106,10,18,3.5,69.49,5.04,35526,737,2.07,Eastern Mediterranean
1,Albania,4880,144,2745,1991,117,6,63,2.95,56.25,5.25,4171,709,17.0,Europe
2,Algeria,27973,1163,18837,7973,616,8,749,4.16,67.34,6.17,23691,4282,18.07,Africa
6,Argentina,167416,3059,72575,91782,4890,120,2057,1.83,43.35,4.21,130774,36642,28.02,Americas
8,Australia,15303,167,9311,5825,368,6,137,1.09,60.84,1.79,12428,2875,23.13,Western Pacific


In [61]:
# Find the countries whose WHO Region is South-East Asia

covid[covid['WHO Region'] == 'South-East Asia']

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
13,Bangladesh,226225,2965,125683,97577,2772,37,1801,1.31,55.56,2.36,207453,18772,9.05,South-East Asia
19,Bhutan,99,0,86,13,4,0,1,0.0,86.87,0.0,90,9,10.0,South-East Asia
27,Burma,350,6,292,52,0,0,2,1.71,83.43,2.05,341,9,2.64,South-East Asia
79,India,1480073,33408,951166,495499,44457,637,33598,2.26,64.26,3.51,1155338,324735,28.11,South-East Asia
80,Indonesia,100303,4838,58173,37292,1525,57,1518,4.82,58.0,8.32,88214,12089,13.7,South-East Asia
106,Maldives,3369,15,2547,807,67,0,19,0.45,75.6,0.59,2999,370,12.34,South-East Asia
119,Nepal,18752,48,13754,4950,139,3,626,0.26,73.35,0.35,17844,908,5.09,South-East Asia
158,Sri Lanka,2805,11,2121,673,23,0,15,0.39,75.61,0.52,2730,75,2.75,South-East Asia
167,Thailand,3297,58,3111,128,6,0,2,1.76,94.36,1.86,3250,47,1.45,South-East Asia
168,Timor-Leste,24,0,0,24,0,0,0,0.0,0.0,0.0,24,0,0.0,South-East Asia


In [63]:
covid['WHO Region'].unique() # .unique() lists the distinct values that occur in the column
# For categorical data it shows each category once, so you can check which categories exist

array(['Eastern Mediterranean', 'Europe', 'Africa', 'Americas',
       'Western Pacific', 'South-East Asia'], dtype=object)
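
Two related tools may be useful next to `.unique()`: `value_counts()` to count rows per category and `isin()` to filter on several categories at once. The sketch below uses a small made-up frame (column names are illustrative).

```python
import pandas as pd

df = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D"],
        "region": ["Europe", "Africa", "Europe", "Americas"],
    }
)

# How many rows fall into each category?
counts = df["region"].value_counts()

# Keep the rows whose region is any of the listed values.
picked = df[df["region"].isin(["Europe", "Americas"])]

print(counts)
print(picked)
```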

### Using pandas 4. Accessing data by row

In [68]:
# Example data: library information

books_dict = {"Available":[True, True, False], "Location":[102, 215, 323], "Genre":["Programming", "Physics", "Math"]}

books_df = pd.DataFrame(books_dict, index=['버그란 무엇인가', '두근두근 물리학', '미분해줘 홈즈'])

books_df

| | Available | Location | Genre |
|---|---|---|---|
| 버그란 무엇인가 | True | 102 | Programming |
| 두근두근 물리학 | True | 215 | Physics |
| 미분해줘 홈즈 | False | 323 | Math |


### Access by index label: `.loc[row, col]`

In [69]:
books_df.loc["버그란 무엇인가"]

Available           True
Location             102
Genre        Programming
Name: 버그란 무엇인가, dtype: object

In [70]:
type(books_df.loc["버그란 무엇인가"])

pandas.core.series.Series

In [73]:
# Is the book "미분해줘 홈즈" available to borrow?

books_df.loc['미분해줘 홈즈']['Available']

False

In [74]:
books_df.loc['미분해줘 홈즈','Available']

False
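
`.loc` also accepts lists and slices of labels. The following sketch uses a made-up frame with hypothetical ASCII labels (`book_a`, `book_b`, `book_c`) in place of the book titles above.

```python
import pandas as pd

books = pd.DataFrame(
    {"Available": [True, True, False], "Location": [102, 215, 323]},
    index=["book_a", "book_b", "book_c"],
)

# Single cell: row label, column label.
one_cell = books.loc["book_c", "Available"]

# Several rows of one column: pass a list of row labels.
some_rows = books.loc[["book_a", "book_c"], "Location"]

# A label slice includes BOTH endpoints.
label_slice = books.loc["book_a":"book_b"]

print(one_cell)
print(some_rows)
print(label_slice)
```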

### Access by integer position: `.iloc[rowidx, colidx]`

In [81]:
# Get the value at row position 0, column position 1

books_df.iloc[0, 1]

102

In [82]:
# Get column positions 0 to 1 of row position 1

books_df.iloc[1, 0:2]

Available    True
Location      215
Name: 두근두근 물리학, dtype: object
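
`.iloc` follows ordinary Python positional rules: negative positions count from the end and the stop of a slice is excluded. A short sketch on a made-up copy of the library table with the default integer index:

```python
import pandas as pd

books = pd.DataFrame(
    {
        "Available": [True, True, False],
        "Location": [102, 215, 323],
        "Genre": ["Programming", "Physics", "Math"],
    }
)

last_available = books.iloc[-1, 0]  # last row, first column
first_two = books.iloc[0:2]         # row positions 0 and 1 (stop is excluded)
locations = books.iloc[:, 1]        # every row of column position 1

print(last_available)
print(first_two)
print(locations)
```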

### Using pandas 5. groupby

- Split: divide the DataFrame into groups according to some "criterion"
- Apply: reduce each group with a statistical function such as sum(), mean(), or median()
- Combine: build a new Series from the applied results (group_key : applied_value)
  
`.groupby()`
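
To make the three steps concrete, the sketch below performs them by hand on a small made-up frame and compares the result with the built-in call. The manual version is for illustration only; it is not how pandas implements `groupby` internally.

```python
import pandas as pd

df = pd.DataFrame(
    {"region": ["E", "A", "E", "A"], "confirmed": [1, 2, 3, 4]}
)

# Split: one piece of the "confirmed" column per region.
groups = {key: part["confirmed"] for key, part in df.groupby("region")}

# Apply: reduce each piece to a single number.
sums = {key: values.sum() for key, values in groups.items()}

# Combine: collect the results into a new Series keyed by group.
manual = pd.Series(sums, name="confirmed")

# The built-in equivalent does all three steps in one chain.
builtin = df.groupby("region")["confirmed"].sum()

print(manual)
print(builtin)
```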

In [83]:
covid.head(5)

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
0,Afghanistan,36263,1269,25198,9796,106,10,18,3.5,69.49,5.04,35526,737,2.07,Eastern Mediterranean
1,Albania,4880,144,2745,1991,117,6,63,2.95,56.25,5.25,4171,709,17.0,Europe
2,Algeria,27973,1163,18837,7973,616,8,749,4.16,67.34,6.17,23691,4282,18.07,Africa
3,Andorra,907,52,803,52,10,0,0,5.73,88.53,6.48,884,23,2.6,Europe
4,Angola,950,41,242,667,18,1,0,4.32,25.47,16.94,749,201,26.84,Africa


In [85]:
# Confirmed cases per WHO Region

# The data is currently organized per country, so regroup it by WHO Region.

# 1. Extract only the confirmed-cases column from covid.
# 2. Group it by covid's WHO Region column.

covid_by_region = covid['Confirmed'].groupby(by=covid["WHO Region"])

covid_by_region
# Everything up to here is the Split step

<pandas.core.groupby.generic.SeriesGroupBy object at 0x000002078BAF5970>

In [87]:
covid_by_region.sum()
# Calling sum() performs the Apply and Combine steps in one go

WHO Region
Africa                    723207
Americas                 8839286
Eastern Mediterranean    1490744
Europe                   3299523
South-East Asia          1835297
Western Pacific           292428
Name: Confirmed, dtype: int64

In [89]:
# Average number of confirmed cases per country in each region

covid_by_region.mean() # sum() / number of countries

WHO Region
Africa                    15066.812500
Americas                 252551.028571
Eastern Mediterranean     67761.090909
Europe                    58920.053571
South-East Asia          183529.700000
Western Pacific           18276.750000
Name: Confirmed, dtype: float64
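
Several statistics can be computed in a single pass with `agg`. This is a sketch on made-up numbers; it also uses the `df.groupby(column)[value]` spelling, which is an equivalent way of writing the grouping used above.

```python
import pandas as pd

df = pd.DataFrame(
    {"region": ["E", "A", "E", "A"], "confirmed": [10, 20, 30, 60]}
)

# One row per region, one column per statistic.
summary = df.groupby("region")["confirmed"].agg(["sum", "mean", "count"])

print(summary)
```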

## Mission:
### 1. Which country in the covid data has the highest death rate per 100 cases (`Deaths / 100 Cases`)?

In [91]:
covid = pd.read_csv("./country_wise_latest.csv")

covid

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
0,Afghanistan,36263,1269,25198,9796,106,10,18,3.50,69.49,5.04,35526,737,2.07,Eastern Mediterranean
1,Albania,4880,144,2745,1991,117,6,63,2.95,56.25,5.25,4171,709,17.00,Europe
2,Algeria,27973,1163,18837,7973,616,8,749,4.16,67.34,6.17,23691,4282,18.07,Africa
3,Andorra,907,52,803,52,10,0,0,5.73,88.53,6.48,884,23,2.60,Europe
4,Angola,950,41,242,667,18,1,0,4.32,25.47,16.94,749,201,26.84,Africa
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
182,West Bank and Gaza,10621,78,3752,6791,152,2,0,0.73,35.33,2.08,8916,1705,19.12,Eastern Mediterranean
183,Western Sahara,10,1,8,1,0,0,0,10.00,80.00,12.50,10,0,0.00,Africa
184,Yemen,1691,483,833,375,10,4,36,28.56,49.26,57.98,1619,72,4.45,Eastern Mediterranean
185,Zambia,4552,140,2815,1597,71,1,465,3.08,61.84,4.97,3326,1226,36.86,Africa


In [95]:
covid.columns

Index(['Country/Region', 'Confirmed', 'Deaths', 'Recovered', 'Active',
       'New cases', 'New deaths', 'New recovered', 'Deaths / 100 Cases',
       'Recovered / 100 Cases', 'Deaths / 100 Recovered',
       'Confirmed last week', '1 week change', '1 week % increase',
       'WHO Region'],
      dtype='object')

In [106]:
mission1 = covid['Deaths / 100 Cases']

mission1

0       3.50
1       2.95
2       4.16
3       5.73
4       4.32
       ...  
182     0.73
183    10.00
184    28.56
185     3.08
186     1.33
Name: Deaths / 100 Cases, Length: 187, dtype: float64

In [107]:
covid[covid['Deaths / 100 Cases'] == max(mission1)]

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
184,Yemen,1691,483,833,375,10,4,36,28.56,49.26,57.98,1619,72,4.45,Eastern Mediterranean


In [108]:
covid[covid['Deaths / 100 Cases'] == max(mission1)]['Country/Region']

184    Yemen
Name: Country/Region, dtype: object
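
As an alternative to comparing against the maximum, `idxmax()` returns the index label of the largest value directly, and `nlargest()` returns the top rows. A sketch on a made-up frame (the column names `country` and `rate` are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"country": ["A", "B", "C"], "rate": [3.5, 28.56, 10.0]}
)

# Index label of the row holding the maximum rate.
top_label = df["rate"].idxmax()

# Use that label with .loc to read another column of the same row.
top_country = df.loc[top_label, "country"]

# nlargest returns the n rows with the largest values in the given column.
top_row = df.nlargest(1, "rate")

print(top_label, top_country)
print(top_row)
```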

### 2. In the covid data, which countries have no new cases and belong to the WHO Region 'Europe'?  
Hint: applying two conditions at once on a single line may trigger a warning.

In [105]:
data1 = covid[covid['New cases'] == 0]

data1

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
14,Barbados,110,7,94,9,0,0,0,6.36,85.45,7.45,106,4,3.77,Americas
17,Belize,48,2,26,20,0,0,0,4.17,54.17,7.69,40,8,20.0,Americas
18,Benin,1770,35,1036,699,0,0,0,1.98,58.53,3.38,1602,168,10.49,Africa
24,Brunei,141,3,138,0,0,0,0,2.13,97.87,2.17,141,0,0.0,Western Pacific
27,Burma,350,6,292,52,0,0,2,1.71,83.43,2.05,341,9,2.64,South-East Asia
33,Central African Republic,4599,59,1546,2994,0,0,0,1.28,33.62,3.82,4548,51,1.12,Africa
38,Comoros,354,7,328,19,0,0,0,1.98,92.66,2.13,334,20,5.99,Africa
49,Dominica,18,0,18,0,0,0,0,0.0,100.0,0.0,18,0,0.0,Americas
54,Equatorial Guinea,3071,51,842,2178,0,0,0,1.66,27.42,6.06,3071,0,0.0,Africa
56,Estonia,2034,69,1923,42,0,0,1,3.39,94.54,3.59,2021,13,0.64,Europe


In [111]:
mission2 = data1[data1['WHO Region'] == 'Europe']

mission2

Unnamed: 0,Country/Region,Confirmed,Deaths,Recovered,Active,New cases,New deaths,New recovered,Deaths / 100 Cases,Recovered / 100 Cases,Deaths / 100 Recovered,Confirmed last week,1 week change,1 week % increase,WHO Region
56,Estonia,2034,69,1923,42,0,0,1,3.39,94.54,3.59,2021,13,0.64,Europe
75,Holy See,12,0,12,0,0,0,0,0.0,100.0,0.0,12,0,0.0,Europe
95,Latvia,1219,31,1045,143,0,0,0,2.54,85.73,2.97,1192,27,2.27,Europe
100,Liechtenstein,86,1,81,4,0,0,0,1.16,94.19,1.23,86,0,0.0,Europe
113,Monaco,116,4,104,8,0,0,0,3.45,89.66,3.85,109,7,6.42,Europe
143,San Marino,699,42,657,0,0,0,0,6.01,93.99,6.39,699,0,0.0,Europe
157,Spain,272421,28432,150376,93613,0,0,0,10.44,55.2,18.91,264836,7585,2.86,Europe
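
The two filters can also be combined into one boolean mask with `&`, with each condition wrapped in parentheses. Building a single mask on the original frame is one way to sidestep the reindexing warning that can appear when a mask computed on the full frame is applied to an already filtered one. A sketch on made-up data:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "country": ["A", "B", "C", "D"],
        "new_cases": [0, 5, 0, 0],
        "region": ["Europe", "Europe", "Africa", "Europe"],
    }
)

# Parentheses are required: & binds more tightly than == in Python.
mask = (df["new_cases"] == 0) & (df["region"] == "Europe")

result = df[mask]
print(result)
```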


### 3. Using the following [data](https://www.kaggle.com/neuromusic/avocado-prices), print the highest average price (AveragePrice) of avocados for each region.

In [112]:
avocado = pd.read_csv('./avocado.csv')

avocado

Unnamed: 0.1,Unnamed: 0,Date,AveragePrice,Total Volume,4046,4225,4770,Total Bags,Small Bags,Large Bags,XLarge Bags,type,year,region
0,0,2015-12-27,1.33,64236.62,1036.74,54454.85,48.16,8696.87,8603.62,93.25,0.0,conventional,2015,Albany
1,1,2015-12-20,1.35,54876.98,674.28,44638.81,58.33,9505.56,9408.07,97.49,0.0,conventional,2015,Albany
2,2,2015-12-13,0.93,118220.22,794.70,109149.67,130.50,8145.35,8042.21,103.14,0.0,conventional,2015,Albany
3,3,2015-12-06,1.08,78992.15,1132.00,71976.41,72.58,5811.16,5677.40,133.76,0.0,conventional,2015,Albany
4,4,2015-11-29,1.28,51039.60,941.48,43838.39,75.78,6183.95,5986.26,197.69,0.0,conventional,2015,Albany
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18244,7,2018-02-04,1.63,17074.83,2046.96,1529.20,0.00,13498.67,13066.82,431.85,0.0,organic,2018,WestTexNewMexico
18245,8,2018-01-28,1.71,13888.04,1191.70,3431.50,0.00,9264.84,8940.04,324.80,0.0,organic,2018,WestTexNewMexico
18246,9,2018-01-21,1.87,13766.76,1191.92,2452.79,727.94,9394.11,9351.80,42.31,0.0,organic,2018,WestTexNewMexico
18247,10,2018-01-14,1.93,16205.22,1527.63,2981.04,727.01,10969.54,10919.54,50.00,0.0,organic,2018,WestTexNewMexico


In [115]:
avocado_by_region = avocado['AveragePrice'].groupby(by=avocado['region'])

avocado_by_region

<pandas.core.groupby.generic.SeriesGroupBy object at 0x000002078BCDED00>

In [118]:
mission3 = avocado_by_region.max()

mission3

region
Albany                 2.13
Atlanta                2.75
BaltimoreWashington    2.28
Boise                  2.79
Boston                 2.19
BuffaloRochester       2.57
California             2.58
Charlotte              2.83
Chicago                2.30
CincinnatiDayton       2.20
Columbus               2.22
DallasFtWorth          1.90
Denver                 2.16
Detroit                2.08
GrandRapids            2.73
GreatLakes             1.98
HarrisburgScranton     2.27
HartfordSpringfield    2.68
Houston                1.92
Indianapolis           2.10
Jacksonville           2.99
LasVegas               3.03
LosAngeles             2.44
Louisville             2.29
MiamiFtLauderdale      3.05
Midsouth               2.17
Nashville              2.24
NewOrleansMobile       2.32
NewYork                2.65
Northeast              2.31
NorthernNewEngland     1.96
Orlando                2.87
Philadelphia           2.45
PhoenixTucson          2.62
Pittsburgh             1.83
Plains       