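## Basic syntax for time series data
- The datetime module
    - Provides classes for working with dates and times
    - Commonly used features of the datetime class
        - Attributes such as year, month, day, hour, minute, and second, plus the now() constructor
        - strftime: formats a datetime as a string in a given format
        - strptime: parses a string into a datetime according to a given format
- timedelta
    - Represents the difference between two dates or times
    - Expresses intervals in days, hours, minutes, seconds, and so on, and supports addition and subtraction

- Pandas time series syntax
    - pd.to_datetime: converts strings to date/time objects
    - pd.date_range: generates a date index over a specified range
    - pandas also provides many other time series methods
        - resample, shift, rolling, and so on
- A DatetimeIndex can be set as the index
    - This is the easiest way to enable a wide range of preprocessing
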

In [4]:
## datetime


import datetime
now = datetime.datetime.now()
print(now)

2024-06-30 21:06:21.991162


In [6]:
# A datetime can also be built from explicit values
sp_date = datetime.datetime(2024, 6, 30, 21, 6, 21)

In [8]:
print(sp_date)

2024-06-30 21:06:21


In [9]:
type(now)

datetime.datetime

In [10]:
# A datetime object exposes many attributes.
# For example, year, month, and hour can be read directly.
now.year

2024

In [11]:
now.month

6

In [12]:
now.hour

21

In [13]:
# Dates can be handled in the same way.
# date.today() in the datetime module returns today's date.
datetime.date.today()

datetime.date(2024, 6, 30)

In [16]:
## strftime
## Formats a date and time as a string.
## Example format: %Y-%m-%d %H:%M:%S

now.strftime('%Y-%m-%d %H:%M:%S')

'2024-06-30 21:06:21'

In [19]:
now.strftime('%Y-%m')

'2024-06'

In [20]:
now.strftime('%y-%m')

'24-06'

In [22]:
## strptime parses a string into a datetime object.

date_ob = datetime.datetime.strptime('2024-06-30 21:06:12', '%Y-%m-%d %H:%M:%S')

In [25]:
print(date_ob)
print(type(date_ob))

2024-06-30 21:06:12
<class 'datetime.datetime'>
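
As a small illustration of how strftime and strptime mirror each other, the sketch below formats a fixed datetime and parses it back with the same format string. The names `stamp`, `fmt`, `text`, and `parsed` are made up for this example.

```python
import datetime

# A fixed value without microseconds, so the round trip is exact.
stamp = datetime.datetime(2024, 6, 30, 21, 6, 12)
fmt = '%Y-%m-%d %H:%M:%S'

text = stamp.strftime(fmt)                      # datetime -> str
parsed = datetime.datetime.strptime(text, fmt)  # str -> datetime

print(text)
print(parsed == stamp)
```

The round trip only stays exact when the format string keeps every field you care about; `now` above carries microseconds, which this format would drop.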


In [28]:
## timedelta
## Represents the difference between two dates or times.
dt = datetime.timedelta(days=5, hours=3)

In [30]:
dt

datetime.timedelta(days=5, seconds=10800)

In [32]:
now + dt  # a point in the future

datetime.datetime(2024, 7, 6, 0, 6, 21, 991162)

In [33]:
now - dt  # a point in the past

datetime.datetime(2024, 6, 25, 18, 6, 21, 991162)

In [35]:
## Weekends and weekdays can be computed.
now.weekday()  # day of the week as an integer: 0 Monday, 1 Tuesday, 2 Wednesday, 3 Thursday, 4 Friday, 5 Saturday, 6 Sunday

6

In [38]:
# Compute the upcoming Monday (if today is already a Monday, today is returned)
new_weekday = now + datetime.timedelta(days=(7 - now.weekday()) % 7)

In [39]:
print(new_weekday)

2024-07-01 21:06:21.991162
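
The expression above always lands on a Monday. If what is wanted is the next working day after any given date, one possible sketch is a small loop that skips Saturday and Sunday. The helper name `next_business_day` is hypothetical, and public holidays are deliberately ignored.

```python
import datetime

def next_business_day(d):
    """Return the first Monday-to-Friday date strictly after d (holidays ignored)."""
    nxt = d + datetime.timedelta(days=1)
    while nxt.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        nxt += datetime.timedelta(days=1)
    return nxt

sunday = datetime.date(2024, 6, 30)
friday = datetime.date(2024, 6, 28)
thursday = datetime.date(2024, 6, 27)

print(next_business_day(sunday))
print(next_business_day(friday))
print(next_business_day(thursday))
```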


In [43]:
## relativedelta makes it easy to add or subtract calendar offsets such as 3 months or 1 year.

from dateutil.relativedelta import relativedelta

# Large offsets from the current date, such as 3 months or 1 year, are easy to compute.
now + relativedelta(months=3)

datetime.datetime(2024, 9, 30, 21, 6, 21, 991162)

In [44]:
now -relativedelta(months=3)

datetime.datetime(2024, 3, 30, 21, 6, 21, 991162)

In [45]:
now -relativedelta(years=1)

datetime.datetime(2023, 6, 30, 21, 6, 21, 991162)

In [46]:
now -relativedelta(days=1)

datetime.datetime(2024, 6, 29, 21, 6, 21, 991162)
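
To make the difference between relativedelta and timedelta concrete, the sketch below adds "one month" and "30 days" to the same date. It assumes the `python-dateutil` package is installed (it is imported above); the variable names are illustrative only.

```python
import datetime
from dateutil.relativedelta import relativedelta

jan_31 = datetime.date(2024, 1, 31)

# Calendar-aware: moves to the same day next month, clamped to the month's last day.
one_month_later = jan_31 + relativedelta(months=1)

# Fixed length: always exactly 30 days, regardless of month boundaries.
thirty_days_later = jan_31 + datetime.timedelta(days=30)

print(one_month_later)
print(thirty_days_later)
```

timedelta has no `months` or `years` argument because those units do not have a fixed length, which is why relativedelta is the usual choice for calendar offsets.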

In [49]:
## pd.date_range
import pandas as pd
dt_rn = pd.date_range(start='2023-06-30', end='2024-06-30', freq='D')  # freq aliases: 'D' day, 'Y' year, 'M' month, 'W' week

In [51]:
dt_rn

DatetimeIndex(['2023-06-30', '2023-07-01', '2023-07-02', '2023-07-03',
               '2023-07-04', '2023-07-05', '2023-07-06', '2023-07-07',
               '2023-07-08', '2023-07-09',
               ...
               '2024-06-21', '2024-06-22', '2024-06-23', '2024-06-24',
               '2024-06-25', '2024-06-26', '2024-06-27', '2024-06-28',
               '2024-06-29', '2024-06-30'],
              dtype='datetime64[ns]', length=367, freq='D')
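
Beyond daily ranges, a couple of other `pd.date_range` variations may be worth trying. The sketch below uses a weekly frequency and the `periods` argument in place of `end`; the variable names are invented for illustration.

```python
import pandas as pd

# Weekly frequency: 'W' anchors each date on a Sunday by default.
sundays = pd.date_range(start='2024-06-01', end='2024-06-30', freq='W')

# periods= can be given instead of end=; 'MS' means month start.
month_starts = pd.date_range(start='2024-01-01', periods=3, freq='MS')

print(sundays)
print(month_starts)
```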

In [56]:
import calendar

cal_6=calendar.month(2024, 6)

In [57]:
print(cal_6)

     June 2024
Mo Tu We Th Fr Sa Su
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30



In [59]:
## Weekend and weekday checks are simple to write.

now.weekday() >= 5  # True on a weekend

True

In [60]:
now.weekday() < 5  # True on a weekday

False

### Load the crime data and work with it as a time series

In [62]:
df=pd.read_csv('crime.csv')

In [64]:
# Time series methods apply only to data with a datetime dtype.
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 460911 entries, 0 to 460910
Data columns (total 9 columns):
 #   Column               Non-Null Count   Dtype  
---  ------               --------------   -----  
 0   Unnamed: 0           460911 non-null  int64  
 1   OFFENSE_TYPE_ID      460911 non-null  object 
 2   OFFENSE_CATEGORY_ID  460911 non-null  object 
 3   REPORTED_DATE        460911 non-null  object 
 4   GEO_LON              457296 non-null  float64
 5   GEO_LAT              457296 non-null  float64
 6   NEIGHBORHOOD_ID      460911 non-null  object 
 7   IS_CRIME             460911 non-null  int64  
 8   IS_TRAFFIC           460911 non-null  int64  
dtypes: float64(2), int64(3), object(4)
memory usage: 31.6+ MB


In [68]:
df_1 = df.copy()

In [70]:
# The column must be converted to a datetime dtype.
# pd.to_datetime: converts strings to date/time objects

df_1['REPORTED_DATE'] = pd.to_datetime(df_1['REPORTED_DATE'])

In [71]:
df_1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 460911 entries, 0 to 460910
Data columns (total 9 columns):
 #   Column               Non-Null Count   Dtype         
---  ------               --------------   -----         
 0   Unnamed: 0           460911 non-null  int64         
 1   OFFENSE_TYPE_ID      460911 non-null  object        
 2   OFFENSE_CATEGORY_ID  460911 non-null  object        
 3   REPORTED_DATE        460911 non-null  datetime64[ns]
 4   GEO_LON              457296 non-null  float64       
 5   GEO_LAT              457296 non-null  float64       
 6   NEIGHBORHOOD_ID      460911 non-null  object        
 7   IS_CRIME             460911 non-null  int64         
 8   IS_TRAFFIC           460911 non-null  int64         
dtypes: datetime64[ns](1), float64(2), int64(3), object(3)
memory usage: 31.6+ MB


In [73]:
df_1['REPORTED_DATE']

0        2014-06-29 02:01:00
1        2014-06-29 01:54:00
2        2014-06-29 02:00:00
3        2014-06-29 02:18:00
4        2014-06-29 04:17:00
                 ...        
460906   2017-09-13 05:48:00
460907   2017-09-12 20:37:00
460908   2017-09-12 16:32:00
460909   2017-09-12 13:04:00
460910   2017-09-12 09:30:00
Name: REPORTED_DATE, Length: 460911, dtype: datetime64[ns]
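
When a column may contain malformed values, `pd.to_datetime` can be given an explicit `format` and `errors='coerce'` so that unparseable entries become `NaT` instead of raising. The sketch below uses a tiny made-up Series (`raw`) rather than the crime data.

```python
import pandas as pd

# Two valid timestamps and one deliberately broken value (made-up data).
raw = pd.Series(['2014-06-29 02:01:00', '2014-06-29 01:54:00', 'not a date'])

parsed = pd.to_datetime(raw, format='%Y-%m-%d %H:%M:%S', errors='coerce')

print(parsed)
print(parsed.isna().sum())  # number of values that could not be parsed
```

Counting the `NaT` values right after conversion is a cheap way to notice that the format string does not match the data.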

In [76]:
## To use time series indexing, one more step is needed:
## turn the datetime column into a DatetimeIndex
## with set_index().
## Convert the index to time series data
df_1 = df_1.set_index('REPORTED_DATE')  # time series indexing and methods are now available

In [79]:
## Partial-string selection works through loc
df_1.loc['2014']  # selects every row from 2014

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-06-29 02:01:00,0,traffic-accident-dui-duid,traffic-accident,-105.000149,39.745753,cbd,0,1
2014-06-29 01:54:00,1,vehicular-eluding-no-chase,all-other-crimes,-104.884660,39.738702,east-colfax,1,0
2014-06-29 02:00:00,2,disturbing-the-peace,public-disorder,-105.020719,39.706674,athmar-park,1,0
2014-06-29 02:18:00,3,curfew,public-disorder,-105.001552,39.769505,sunnyside,1,0
2014-06-29 04:17:00,4,aggravated-assault,aggravated-assault,-105.018557,39.679229,college-view-south-platte,1,0
...,...,...,...,...,...,...,...,...
2014-01-21 07:32:00,457344,traffic-accident,traffic-accident,-104.998902,39.711204,baker,0,1
2014-05-22 10:55:00,457385,harassment-dv,public-disorder,-104.905262,39.724593,hilltop,1,0
2014-01-20 17:27:00,457769,traffic-accident,traffic-accident,-104.968227,39.739752,cheesman-park,0,1
2014-01-02 10:44:00,458291,theft-from-bldg,larceny,-105.046018,39.720992,barnum-west,1,0


In [81]:
df_1.loc['2014-06-29']  # selects every row from that day

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-06-29 02:01:00,0,traffic-accident-dui-duid,traffic-accident,-105.000149,39.745753,cbd,0,1
2014-06-29 01:54:00,1,vehicular-eluding-no-chase,all-other-crimes,-104.884660,39.738702,east-colfax,1,0
2014-06-29 02:00:00,2,disturbing-the-peace,public-disorder,-105.020719,39.706674,athmar-park,1,0
2014-06-29 02:18:00,3,curfew,public-disorder,-105.001552,39.769505,sunnyside,1,0
2014-06-29 04:17:00,4,aggravated-assault,aggravated-assault,-105.018557,39.679229,college-view-south-platte,1,0
...,...,...,...,...,...,...,...,...
2014-06-29 15:36:00,306181,theft-of-motor-vehicle,auto-theft,-105.024077,39.770150,sunnyside,1,0
2014-06-29 16:09:00,359150,sex-aslt-non-rape-pot,sexual-assault,,,east-colfax,1,0
2014-06-29 23:30:00,360360,menacing-felony-w-weap,aggravated-assault,-104.819770,39.796053,montbello,1,0
2014-06-29 16:33:00,362974,assault-simple,other-crimes-against-persons,-104.929721,39.755384,north-park-hill,1,0


In [82]:
df_1.loc['Dec 2014']  # selects every row from December 2014

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-12-19 17:42:00,1219,traffic-accident,traffic-accident,-104.940355,39.781688,northeast-park-hill,0,1
2014-12-06 08:25:00,1225,disturbing-the-peace,public-disorder,-104.817529,39.773653,montbello,1,0
2014-12-19 08:29:00,1301,burglary-residence-no-force,burglary,-104.901114,39.729047,lowry-field,1,0
2014-12-01 22:13:00,1322,liquor-possession,drug-alcohol,-104.966870,39.738576,cheesman-park,1,0
2014-12-30 08:26:00,1341,theft-items-from-vehicle,theft-from-motor-vehicle,-105.037817,39.771111,berkeley,1,0
...,...,...,...,...,...,...,...,...
2014-12-18 22:36:00,377801,criminal-trespassing,all-other-crimes,-104.999197,39.739042,lincoln-park,1,0
2014-12-10 19:35:00,377834,assault-simple,other-crimes-against-persons,-105.033088,39.706098,westwood,1,0
2014-12-16 13:51:00,379049,assault-simple,other-crimes-against-persons,-104.921734,39.753780,north-park-hill,1,0
2014-12-20 08:37:00,416416,drug-methampetamine-sell,drug-alcohol,-105.008755,39.697055,athmar-park,1,0


In [83]:
# The result can be sorted by its index.
df_1.loc['Dec 2014'].sort_index()

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-12-01 00:01:00,178089,theft-items-from-vehicle,theft-from-motor-vehicle,-104.904967,39.701152,washington-virginia-vale,1,0
2014-12-01 00:19:00,216781,traf-other,all-other-crimes,-105.024963,39.677701,college-view-south-platte,1,0
2014-12-01 00:28:00,216778,theft-shoplift,larceny,-105.053677,39.675409,harvey-park,1,0
2014-12-01 00:33:00,2852,theft-of-services,larceny,-105.052872,39.710916,westwood,1,0
2014-12-01 00:47:00,179534,liquor-possession,drug-alcohol,-104.932821,39.754674,north-park-hill,1,0
...,...,...,...,...,...,...,...,...
2014-12-31 23:59:00,126879,theft-of-motor-vehicle,auto-theft,-105.031583,39.712121,barnum,1,0
2014-12-31 23:59:00,148309,traffic-accident,traffic-accident,-105.040484,39.751199,sloan-lake,0,1
2014-12-31 23:59:00,226091,weapon-carrying-concealed,all-other-crimes,-105.017647,39.718170,valverde,1,0
2014-12-31 23:59:00,226092,disturbing-the-peace,public-disorder,-105.017647,39.718170,valverde,1,0
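
Selecting a range of dates (rather than a single year, month, or day) is generally reliable only on a sorted index, so sorting first is a reasonable habit. The sketch below shows the idea on a four-row toy frame; `toy`, `toy_sorted`, and `second_half` are illustrative names, not part of the crime data.

```python
import pandas as pd

# Deliberately unsorted timestamps, similar in spirit to the crime data.
toy = pd.DataFrame(
    {'IS_CRIME': [1, 0, 1, 1]},
    index=pd.to_datetime(['2014-12-19 17:42', '2014-06-29 02:01',
                          '2015-03-29 18:00', '2014-12-01 22:13']),
)

toy_sorted = toy.sort_index()  # sort once, then slice

# Partial-string slice: both ends are inclusive, so all of December is kept.
second_half = toy_sorted.loc['2014-07':'2014-12']

print(second_half)
```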


In [84]:
## Rows can be selected by time of day.
# between_time keeps rows whose time falls between the two bounds, on any date.
df_1.between_time('01:00', '03:00')

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-06-29 02:01:00,0,traffic-accident-dui-duid,traffic-accident,-105.000149,39.745753,cbd,0,1
2014-06-29 01:54:00,1,vehicular-eluding-no-chase,all-other-crimes,-104.884660,39.738702,east-colfax,1,0
2014-06-29 02:00:00,2,disturbing-the-peace,public-disorder,-105.020719,39.706674,athmar-park,1,0
2014-06-29 02:18:00,3,curfew,public-disorder,-105.001552,39.769505,sunnyside,1,0
2014-06-29 02:56:00,6,traffic-accident-dui-duid,traffic-accident,-105.052956,39.733315,villa-park,0,1
...,...,...,...,...,...,...,...,...
2017-09-22 01:30:00,460774,drug-methampetamine-sell,drug-alcohol,-105.025080,39.699230,westwood,1,0
2017-08-28 01:07:00,460852,traf-other,all-other-crimes,-105.011016,39.696419,ruby-hill,1,0
2017-09-13 02:21:00,460867,assault-simple,other-crimes-against-persons,-104.925733,39.654184,university-hills,1,0
2017-09-13 02:15:00,460889,traffic-accident-hit-and-run,traffic-accident,-105.043950,39.787436,regis,0,1


In [85]:
df_1.at_time('18:00')

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-03-07 18:00:00,1284,burglary-business-no-force,burglary,-104.994540,39.753174,union-station,1,0
2013-06-06 18:00:00,2220,traffic-accident,traffic-accident,-104.958966,39.687381,cory-merrill,0,1
2012-02-27 18:00:00,2315,violation-of-restraining-order,all-other-crimes,-105.035473,39.739885,west-colfax,1,0
2014-03-02 18:00:00,2551,criminal-mischief-mtr-veh,public-disorder,-105.009224,39.788024,chaffee-park,1,0
2015-03-29 18:00:00,2644,threats-to-injure,public-disorder,-104.912386,39.640237,southmoor-park,1,0
...,...,...,...,...,...,...,...,...
2013-09-22 18:00:00,457847,traf-other,all-other-crimes,-104.967097,39.761226,whittier,1,0
2017-09-13 18:00:00,458362,criminal-trespassing,all-other-crimes,-104.993106,39.745407,cbd,1,0
2017-09-17 18:00:00,459126,theft-other,larceny,-104.993555,39.753142,union-station,1,0
2017-08-26 18:00:00,459826,traffic-accident,traffic-accident,-104.893188,39.633405,hampden-south,0,1


In [88]:
## resample
## Using the DatetimeIndex, other columns can be aggregated by year, month, week, or day.
## Here we look at the IS_CRIME column.
df_crime = df_1[['IS_CRIME']]

In [90]:
df_crime.resample('Y').sum()

REPORTED_DATE,IS_CRIME
2012-12-31,37286
2013-12-31,50698
2014-12-31,62690
2015-12-31,65894
2016-12-31,67381
2017-12-31,51902


In [91]:
df_crime.resample('M').sum()

REPORTED_DATE,IS_CRIME
2012-01-31,2660
2012-02-29,2353
2012-03-31,2869
2012-04-30,3070
2012-05-31,3321
...,...
2017-05-31,5965
2017-06-30,5972
2017-07-31,6005
2017-08-31,6568


In [92]:
df_crime.resample('W').sum()

REPORTED_DATE,IS_CRIME
2012-01-08,551
2012-01-15,600
2012-01-22,654
2012-01-29,655
2012-02-05,536
...,...
2017-09-03,1458
2017-09-10,1309
2017-09-17,1474
2017-09-24,1332


In [93]:
# The same aggregation works directly on the original DataFrame.
df_1['IS_CRIME'].resample('M').sum()


REPORTED_DATE
2012-01-31    2660
2012-02-29    2353
2012-03-31    2869
2012-04-30    3070
2012-05-31    3321
              ... 
2017-05-31    5965
2017-06-30    5972
2017-07-31    6005
2017-08-31    6568
2017-09-30    5417
Freq: M, Name: IS_CRIME, Length: 69, dtype: int64

In [95]:
df_1['IS_CRIME'].resample('M').mean()


REPORTED_DATE
2012-01-31    0.628990
2012-02-29    0.591652
2012-03-31    0.652787
2012-04-30    0.658657
2012-05-31    0.645105
                ...   
2017-05-31    0.730647
2017-06-30    0.742509
2017-07-31    0.747169
2017-08-31    0.755464
2017-09-30    0.737709
Freq: M, Name: IS_CRIME, Length: 69, dtype: float64

In [97]:
df_1[['IS_CRIME','IS_TRAFFIC']].resample('Y').sum()

REPORTED_DATE,IS_CRIME,IS_TRAFFIC
2012-12-31,37286,19786
2013-12-31,50698,18862
2014-12-31,62690,21763
2015-12-31,65894,23310
2016-12-31,67381,23744
2017-12-31,51902,17836
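
Depending on the pandas version, the 'M' and 'Y' aliases used above may raise a deprecation warning in favor of 'ME' and 'YE'. As a version-tolerant sketch, the example below resamples a toy Series with 'MS' (month start) and gets yearly totals by grouping on the index year. The data and variable names are made up.

```python
import pandas as pd

toy = pd.Series(
    [1, 0, 1, 1],
    index=pd.to_datetime(['2014-01-05', '2014-01-20', '2014-02-03', '2015-01-01']),
    name='IS_CRIME',
)

# Monthly totals labelled by the first day of each month; empty months sum to 0.
monthly = toy.resample('MS').sum()

# Yearly totals without a frequency alias: group by the year of the index.
yearly = toy.groupby(toy.index.year).sum()

print(monthly.head(3))
print(yearly)
```

Note that resample fills in every month between the first and last observation, while the groupby version only returns years that actually appear in the data.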


In [99]:
df_1

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-06-29 02:01:00,0,traffic-accident-dui-duid,traffic-accident,-105.000149,39.745753,cbd,0,1
2014-06-29 01:54:00,1,vehicular-eluding-no-chase,all-other-crimes,-104.884660,39.738702,east-colfax,1,0
2014-06-29 02:00:00,2,disturbing-the-peace,public-disorder,-105.020719,39.706674,athmar-park,1,0
2014-06-29 02:18:00,3,curfew,public-disorder,-105.001552,39.769505,sunnyside,1,0
2014-06-29 04:17:00,4,aggravated-assault,aggravated-assault,-105.018557,39.679229,college-view-south-platte,1,0
...,...,...,...,...,...,...,...,...
2017-09-13 05:48:00,460906,burglary-business-by-force,burglary,-105.033840,39.762365,west-highland,1,0
2017-09-12 20:37:00,460907,weapon-unlawful-discharge-of,all-other-crimes,-105.040313,39.721264,barnum-west,1,0
2017-09-12 16:32:00,460908,traf-habitual-offender,all-other-crimes,-104.847024,39.779596,montbello,1,0
2017-09-12 13:04:00,460909,criminal-mischief-other,public-disorder,-104.949183,39.756353,skyland,1,0


In [100]:
df_1.shift(20)

REPORTED_DATE,Unnamed: 0,OFFENSE_TYPE_ID,OFFENSE_CATEGORY_ID,GEO_LON,GEO_LAT,NEIGHBORHOOD_ID,IS_CRIME,IS_TRAFFIC
2014-06-29 02:01:00,,,,,,,,
2014-06-29 01:54:00,,,,,,,,
2014-06-29 02:00:00,,,,,,,,
2014-06-29 02:18:00,,,,,,,,
2014-06-29 04:17:00,,,,,,,,
...,...,...,...,...,...,...,...,...
2017-09-13 05:48:00,460886.0,traffic-accident,traffic-accident,-104.924021,39.738229,hale,0.0,1.0
2017-09-12 20:37:00,460887.0,traffic-accident,traffic-accident,-104.912874,39.667630,hampden,0.0,1.0
2017-09-12 16:32:00,460888.0,theft-items-from-vehicle,theft-from-motor-vehicle,-104.981910,39.731684,capitol-hill,1.0,0.0
2017-09-12 13:04:00,460889.0,traffic-accident-hit-and-run,traffic-accident,-105.043950,39.787436,regis,0.0,1.0
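
`shift(20)` moves values down by 20 rows, not by 20 units of time, so it tends to be most meaningful on a sorted, regularly spaced series. One common use is comparing each period with the previous one, sketched below on made-up monthly counts.

```python
import pandas as pd

# Hypothetical monthly counts on a regular month-start index.
monthly = pd.Series([10, 12, 15],
                    index=pd.date_range('2024-01-01', periods=3, freq='MS'))

previous = monthly.shift(1)   # value from the previous row (NaN for the first)
change = monthly - previous   # month-over-month difference

print(pd.DataFrame({'count': monthly, 'previous': previous, 'change': change}))
```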


In [105]:
## rolling: moving-window statistics (here, a sum over a window of 50 rows)
df_1['IS_CRIME'].rolling(window=50).sum()

REPORTED_DATE
2014-06-29 02:01:00     NaN
2014-06-29 01:54:00     NaN
2014-06-29 02:00:00     NaN
2014-06-29 02:18:00     NaN
2014-06-29 04:17:00     NaN
                       ... 
2017-09-13 05:48:00    32.0
2017-09-12 20:37:00    32.0
2017-09-12 16:32:00    32.0
2017-09-12 13:04:00    33.0
2017-09-12 09:30:00    33.0
Name: IS_CRIME, Length: 460911, dtype: float64
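
An integer window such as `window=50` counts rows, so on an unsorted index it mixes unrelated dates. On a sorted DatetimeIndex, rolling also accepts a time offset such as `'2D'`. The sketch below contrasts the two on a small made-up daily series.

```python
import pandas as pd

daily = pd.Series([1, 2, 3, 4, 5],
                  index=pd.date_range('2024-06-01', periods=5, freq='D'))

# Row-based window: mean of the current row and the two rows before it.
row_window = daily.rolling(window=3).mean()

# Time-based window: sum over the 2 days ending at each timestamp.
time_window = daily.rolling('2D').sum()

print(row_window)
print(time_window)
```

The row-based version starts with NaN until the window is full, whereas the time-based version produces a value as soon as at least one observation falls inside the window.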

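### Required assignment 1
- Now that we have covered preprocessing and analysis methods for time series data,
- use the IS_CRIME and IS_TRAFFIC columns to analyze the data as a time series by year and by month, following whatever goal you set as the analyst. (The period is )
- It would be good to summarize, with simple visualizations, which crime types and traffic types occur in each neighborhood.
- Deliverables
    - crime
    - traffic
    - Think through these two areas, organize your work as visualization code, and write up your insights in comments or markdown.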