
# 4.8 Working with Time Series Data

```{margin}
`DatetimeIndex`
```

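## The `DatetimeIndex` index

A time series is data whose index consists of dates or times. To build a time series in pandas, the index must be a `DatetimeIndex`. A `DatetimeIndex` is an index for time series recorded as timestamps, that is, values tied to specific instants. The labels of a timestamp index do not have to be evenly spaced.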

A `DatetimeIndex` is usually created with one of the following helper functions.

* `pd.to_datetime`
* `pd.date_range`

```{margin}
to_datetime
```

`pd.to_datetime` parses strings that represent dates or times into datetime values and returns them as a `DatetimeIndex`.

In [2]:
import pandas as pd
import numpy as np

In [3]:
date_str = ["2018, 1, 1", "2018, 1, 4", "2018, 1, 5", "2018, 1, 6"]
idx = pd.to_datetime(date_str)
idx

DatetimeIndex(['2018-01-01', '2018-01-04', '2018-01-05', '2018-01-06'], dtype='datetime64[ns]', freq=None)
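
When the input format is known, it can be passed explicitly rather than left to automatic parsing. The following is a minimal sketch and not part of the original notebook: the sample strings and the names `raw` and `parsed` are made up for illustration, and `errors="coerce"` is used so that an entry that cannot be parsed becomes `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical input: three ISO-formatted dates and one invalid entry.
raw = ["2018-01-01", "2018-01-04", "not a date", "2018-01-06"]

# An explicit format avoids guessing; errors="coerce" turns failures into NaT.
parsed = pd.to_datetime(raw, format="%Y-%m-%d", errors="coerce")

print(type(parsed).__name__)   # DatetimeIndex
print(parsed.isna().tolist())  # [False, False, True, False]
```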

An index built this way can then be used to create a Series or a DataFrame.

In [4]:
np.random.seed(0)
s = pd.Series(np.random.randn(4), index=idx)
s

2018-01-01    1.764052
2018-01-04    0.400157
2018-01-05    0.978738
2018-01-06    2.240893
dtype: float64
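
The same index works for a DataFrame. The sketch below assumes a made-up column name `value` and uses the differences between consecutive labels to show that the spacing is irregular, which is why the index reports no frequency.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(["2018-01-01", "2018-01-04", "2018-01-05", "2018-01-06"])
df = pd.DataFrame({"value": np.arange(4)}, index=idx)

# Differences between consecutive labels, in days.
gaps = df.index[1:] - df.index[:-1]
print(gaps.days.tolist())  # [3, 1, 1]

# No single frequency fits these labels, so freq is None.
print(df.index.freq)       # None
```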

```{margin}
date_range
```

With `pd.date_range` there is no need to type every date or time by hand: given a start and an end, or a start and a number of periods, it generates the index for that range.

In [5]:
pd.date_range("2018-4-1", "2018-4-30")

DatetimeIndex(['2018-04-01', '2018-04-02', '2018-04-03', '2018-04-04',
               '2018-04-05', '2018-04-06', '2018-04-07', '2018-04-08',
               '2018-04-09', '2018-04-10', '2018-04-11', '2018-04-12',
               '2018-04-13', '2018-04-14', '2018-04-15', '2018-04-16',
               '2018-04-17', '2018-04-18', '2018-04-19', '2018-04-20',
               '2018-04-21', '2018-04-22', '2018-04-23', '2018-04-24',
               '2018-04-25', '2018-04-26', '2018-04-27', '2018-04-28',
               '2018-04-29', '2018-04-30'],
              dtype='datetime64[ns]', freq='D')

In [6]:
pd.date_range(start="2018-4-1", periods=30)

DatetimeIndex(['2018-04-01', '2018-04-02', '2018-04-03', '2018-04-04',
               '2018-04-05', '2018-04-06', '2018-04-07', '2018-04-08',
               '2018-04-09', '2018-04-10', '2018-04-11', '2018-04-12',
               '2018-04-13', '2018-04-14', '2018-04-15', '2018-04-16',
               '2018-04-17', '2018-04-18', '2018-04-19', '2018-04-20',
               '2018-04-21', '2018-04-22', '2018-04-23', '2018-04-24',
               '2018-04-25', '2018-04-26', '2018-04-27', '2018-04-28',
               '2018-04-29', '2018-04-30'],
              dtype='datetime64[ns]', freq='D')

In [7]:
pd.date_range(start="2018-4-1", periods=35)

DatetimeIndex(['2018-04-01', '2018-04-02', '2018-04-03', '2018-04-04',
               '2018-04-05', '2018-04-06', '2018-04-07', '2018-04-08',
               '2018-04-09', '2018-04-10', '2018-04-11', '2018-04-12',
               '2018-04-13', '2018-04-14', '2018-04-15', '2018-04-16',
               '2018-04-17', '2018-04-18', '2018-04-19', '2018-04-20',
               '2018-04-21', '2018-04-22', '2018-04-23', '2018-04-24',
               '2018-04-25', '2018-04-26', '2018-04-27', '2018-04-28',
               '2018-04-29', '2018-04-30', '2018-05-01', '2018-05-02',
               '2018-05-03', '2018-05-04', '2018-05-05'],
              dtype='datetime64[ns]', freq='D')

In [11]:
pd.date_range(start="2020-2-1", periods=35)

DatetimeIndex(['2020-02-01', '2020-02-02', '2020-02-03', '2020-02-04',
               '2020-02-05', '2020-02-06', '2020-02-07', '2020-02-08',
               '2020-02-09', '2020-02-10', '2020-02-11', '2020-02-12',
               '2020-02-13', '2020-02-14', '2020-02-15', '2020-02-16',
               '2020-02-17', '2020-02-18', '2020-02-19', '2020-02-20',
               '2020-02-21', '2020-02-22', '2020-02-23', '2020-02-24',
               '2020-02-25', '2020-02-26', '2020-02-27', '2020-02-28',
               '2020-02-29', '2020-03-01', '2020-03-02', '2020-03-03',
               '2020-03-04', '2020-03-05', '2020-03-06'],
              dtype='datetime64[ns]', freq='D')
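
As a quick check, the two ways of specifying a range can be compared directly. This is an illustrative sketch with made-up variable names. The `end` plus `periods` combination is not used in the notebook above, but `pd.date_range` accepts it as well.

```python
import pandas as pd

by_end = pd.date_range(start="2020-02-01", end="2020-02-29")
by_periods = pd.date_range(start="2020-02-01", periods=29)
back_from_end = pd.date_range(end="2020-02-29", periods=29)

print(len(by_end))                    # 29, since 2020 is a leap year
print(by_end.equals(by_periods))      # True
print(by_end.equals(back_from_end))   # True
```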

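The `freq` argument restricts the output to particular dates or times. Commonly used values are listed below. These are the aliases accepted by the pandas version used in this notebook; pandas 2.2 and later deprecate `T`, `H`, `M`, `BM`, and `Q-...` in favor of `min`, `h`, `ME`, `BME`, and `QE-...`.

* `s`: second
* `T`: minute
* `H`: hour
* `D`: day
* `B`: business day (weekdays only, weekends excluded)
* `W`: weekly, on Sundays
* `W-MON`: weekly, on Mondays
* `M`: last day of each month
* `MS`: first day of each month
* `BM`: last business day of each month
* `BMS`: first business day of each month
* `WOM-2THU`: second Thursday of each month
* `Q-JAN`: last day of the first month of each calendar quarter (January, April, July, October)
* `Q-DEC`: last day of the last month of each calendar quarter (March, June, September, December)

For more details, see the following page.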

* [https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects)

In [12]:
pd.date_range("2018-4-1", "2018-4-30", freq="B")

DatetimeIndex(['2018-04-02', '2018-04-03', '2018-04-04', '2018-04-05',
               '2018-04-06', '2018-04-09', '2018-04-10', '2018-04-11',
               '2018-04-12', '2018-04-13', '2018-04-16', '2018-04-17',
               '2018-04-18', '2018-04-19', '2018-04-20', '2018-04-23',
               '2018-04-24', '2018-04-25', '2018-04-26', '2018-04-27',
               '2018-04-30'],
              dtype='datetime64[ns]', freq='B')

In [15]:
pd.date_range("2024-01-01", "2024-4-30", freq="B")

DatetimeIndex(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04',
               '2024-01-05', '2024-01-08', '2024-01-09', '2024-01-10',
               '2024-01-11', '2024-01-12', '2024-01-15', '2024-01-16',
               '2024-01-17', '2024-01-18', '2024-01-19', '2024-01-22',
               '2024-01-23', '2024-01-24', '2024-01-25', '2024-01-26',
               '2024-01-29', '2024-01-30', '2024-01-31', '2024-02-01',
               '2024-02-02', '2024-02-05', '2024-02-06', '2024-02-07',
               '2024-02-08', '2024-02-09', '2024-02-12', '2024-02-13',
               '2024-02-14', '2024-02-15', '2024-02-16', '2024-02-19',
               '2024-02-20', '2024-02-21', '2024-02-22', '2024-02-23',
               '2024-02-26', '2024-02-27', '2024-02-28', '2024-02-29',
               '2024-03-01', '2024-03-04', '2024-03-05', '2024-03-06',
               '2024-03-07', '2024-03-08', '2024-03-11', '2024-03-12',
               '2024-03-13', '2024-03-14', '2024-03-15', '2024-03-18',
      

In [13]:
pd.date_range("2018-1-1", "2018-12-31", freq="W")

DatetimeIndex(['2018-01-07', '2018-01-14', '2018-01-21', '2018-01-28',
               '2018-02-04', '2018-02-11', '2018-02-18', '2018-02-25',
               '2018-03-04', '2018-03-11', '2018-03-18', '2018-03-25',
               '2018-04-01', '2018-04-08', '2018-04-15', '2018-04-22',
               '2018-04-29', '2018-05-06', '2018-05-13', '2018-05-20',
               '2018-05-27', '2018-06-03', '2018-06-10', '2018-06-17',
               '2018-06-24', '2018-07-01', '2018-07-08', '2018-07-15',
               '2018-07-22', '2018-07-29', '2018-08-05', '2018-08-12',
               '2018-08-19', '2018-08-26', '2018-09-02', '2018-09-09',
               '2018-09-16', '2018-09-23', '2018-09-30', '2018-10-07',
               '2018-10-14', '2018-10-21', '2018-10-28', '2018-11-04',
               '2018-11-11', '2018-11-18', '2018-11-25', '2018-12-02',
               '2018-12-09', '2018-12-16', '2018-12-23', '2018-12-30'],
              dtype='datetime64[ns]', freq='W-SUN')

In [17]:
pd.date_range("2024-1-1", "2024-12-31", freq="W")

DatetimeIndex(['2024-01-07', '2024-01-14', '2024-01-21', '2024-01-28',
               '2024-02-04', '2024-02-11', '2024-02-18', '2024-02-25',
               '2024-03-03', '2024-03-10', '2024-03-17', '2024-03-24',
               '2024-03-31', '2024-04-07', '2024-04-14', '2024-04-21',
               '2024-04-28', '2024-05-05', '2024-05-12', '2024-05-19',
               '2024-05-26', '2024-06-02', '2024-06-09', '2024-06-16',
               '2024-06-23', '2024-06-30', '2024-07-07', '2024-07-14',
               '2024-07-21', '2024-07-28', '2024-08-04', '2024-08-11',
               '2024-08-18', '2024-08-25', '2024-09-01', '2024-09-08',
               '2024-09-15', '2024-09-22', '2024-09-29', '2024-10-06',
               '2024-10-13', '2024-10-20', '2024-10-27', '2024-11-03',
               '2024-11-10', '2024-11-17', '2024-11-24', '2024-12-01',
               '2024-12-08', '2024-12-15', '2024-12-22', '2024-12-29'],
              dtype='datetime64[ns]', freq='W-SUN')

In [18]:
pd.date_range("2018-1-1", "2018-12-31", freq="W-MON")

DatetimeIndex(['2018-01-01', '2018-01-08', '2018-01-15', '2018-01-22',
               '2018-01-29', '2018-02-05', '2018-02-12', '2018-02-19',
               '2018-02-26', '2018-03-05', '2018-03-12', '2018-03-19',
               '2018-03-26', '2018-04-02', '2018-04-09', '2018-04-16',
               '2018-04-23', '2018-04-30', '2018-05-07', '2018-05-14',
               '2018-05-21', '2018-05-28', '2018-06-04', '2018-06-11',
               '2018-06-18', '2018-06-25', '2018-07-02', '2018-07-09',
               '2018-07-16', '2018-07-23', '2018-07-30', '2018-08-06',
               '2018-08-13', '2018-08-20', '2018-08-27', '2018-09-03',
               '2018-09-10', '2018-09-17', '2018-09-24', '2018-10-01',
               '2018-10-08', '2018-10-15', '2018-10-22', '2018-10-29',
               '2018-11-05', '2018-11-12', '2018-11-19', '2018-11-26',
               '2018-12-03', '2018-12-10', '2018-12-17', '2018-12-24',
               '2018-12-31'],
              dtype='datetime64[ns]', freq='W-MON')

In [19]:
pd.date_range("2018-4-1", "2018-12-31", freq="MS")

DatetimeIndex(['2018-04-01', '2018-05-01', '2018-06-01', '2018-07-01',
               '2018-08-01', '2018-09-01', '2018-10-01', '2018-11-01',
               '2018-12-01'],
              dtype='datetime64[ns]', freq='MS')

In [20]:
pd.date_range("2018-4-1", "2018-12-31", freq="M")

DatetimeIndex(['2018-04-30', '2018-05-31', '2018-06-30', '2018-07-31',
               '2018-08-31', '2018-09-30', '2018-10-31', '2018-11-30',
               '2018-12-31'],
              dtype='datetime64[ns]', freq='M')

In [21]:
pd.date_range("2018-4-1", "2018-12-31", freq="BMS")

DatetimeIndex(['2018-04-02', '2018-05-01', '2018-06-01', '2018-07-02',
               '2018-08-01', '2018-09-03', '2018-10-01', '2018-11-01',
               '2018-12-03'],
              dtype='datetime64[ns]', freq='BMS')

In [22]:
pd.date_range("2018-4-1", "2018-12-31", freq="BM")

DatetimeIndex(['2018-04-30', '2018-05-31', '2018-06-29', '2018-07-31',
               '2018-08-31', '2018-09-28', '2018-10-31', '2018-11-30',
               '2018-12-31'],
              dtype='datetime64[ns]', freq='BM')

In [23]:
pd.date_range("2018-1-1", "2018-12-31", freq="WOM-2THU")

DatetimeIndex(['2018-01-11', '2018-02-08', '2018-03-08', '2018-04-12',
               '2018-05-10', '2018-06-14', '2018-07-12', '2018-08-09',
               '2018-09-13', '2018-10-11', '2018-11-08', '2018-12-13'],
              dtype='datetime64[ns]', freq='WOM-2THU')

In [24]:
pd.date_range("2018-1-1", "2018-12-31", freq="Q-JAN")

DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31'], dtype='datetime64[ns]', freq='Q-JAN')

In [25]:
pd.date_range("2018-1-1", "2018-12-31", freq="Q-DEC")

DatetimeIndex(['2018-03-31', '2018-06-30', '2018-09-30', '2018-12-31'], dtype='datetime64[ns]', freq='Q-DEC')
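
The meaning of several aliases can be confirmed by inspecting the generated dates. The sketch below is an illustration with made-up variable names. It uses only aliases that keep the same spelling in older and newer pandas releases.

```python
import pandas as pd

start, end = "2018-01-01", "2018-12-31"

business_days = pd.date_range(start, end, freq="B")
mondays = pd.date_range(start, end, freq="W-MON")
month_starts = pd.date_range(start, end, freq="MS")
second_thursdays = pd.date_range(start, end, freq="WOM-2THU")

# weekday counts Monday as 0 and Sunday as 6.
print((business_days.weekday < 5).all())      # True
print((mondays.weekday == 0).all())           # True
print((month_starts.day == 1).all())          # True
print((second_thursdays.weekday == 3).all())  # True

# A second Thursday always falls on day 8 through 14 of its month.
print(second_thursdays.day.tolist())
```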

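## The `shift` operation

Because the index of a time series represents times or dates, it supports operations such as moving dates. For example, `shift` can move the data while leaving the index in place. If a `freq` argument is given, the index labels are moved by that offset instead, and the data stay attached to them.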

In [26]:
np.random.seed(0)
ts = pd.Series(np.random.randn(4), index=pd.date_range(
    "2018-1-1", periods=4, freq="M"))
ts

2018-01-31    1.764052
2018-02-28    0.400157
2018-03-31    0.978738
2018-04-30    2.240893
Freq: M, dtype: float64

In [27]:
ts.shift(1)

2018-01-31         NaN
2018-02-28    1.764052
2018-03-31    0.400157
2018-04-30    0.978738
Freq: M, dtype: float64

In [28]:
ts.shift(-1)

2018-01-31    0.400157
2018-02-28    0.978738
2018-03-31    2.240893
2018-04-30         NaN
Freq: M, dtype: float64

In [29]:
ts.shift(1, freq="M")

2018-02-28    1.764052
2018-03-31    0.400157
2018-04-30    0.978738
2018-05-31    2.240893
Freq: M, dtype: float64

In [30]:
ts.shift(1, freq="W")

2018-02-04    1.764052
2018-03-04    0.400157
2018-04-01    0.978738
2018-05-06    2.240893
dtype: float64
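
A common use of `shift` is to compare each value with the previous one. The sketch below uses made-up values and a daily frequency (an assumption chosen to keep the example small) to contrast moving the data with moving the labels.

```python
import pandas as pd

ts_demo = pd.Series([10.0, 12.0, 9.0, 15.0],
                    index=pd.date_range("2018-01-01", periods=4, freq="D"))

# shift(1) moves the data down one row; the index is unchanged.
change = ts_demo - ts_demo.shift(1)
print(change.equals(ts_demo.diff()))        # True

# shift(1, freq="D") moves the labels forward one day; the data is unchanged.
moved = ts_demo.shift(1, freq="D")
print(moved.index[0])                       # 2018-01-02 00:00:00
print(moved.tolist() == ts_demo.tolist())   # True
```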

## The `resample` operation

The `resample` operation changes the time interval of a series (resampling). Moving to a shorter interval increases the number of data points and is called up-sampling; moving to a longer interval reduces it and is called down-sampling.

In [35]:
ts = pd.Series(np.random.randn(100), index=pd.date_range(
    "2024-1-1", periods=100, freq="D"))
ts.tail(20)

2024-03-21    0.862596
2024-03-22   -2.655619
2024-03-23    1.513328
2024-03-24    0.553132
2024-03-25   -0.045704
2024-03-26    0.220508
2024-03-27   -1.029935
2024-03-28   -0.349943
2024-03-29    1.100284
2024-03-30    1.298022
2024-03-31    2.696224
2024-04-01   -0.073925
2024-04-02   -0.658553
2024-04-03   -0.514234
2024-04-04   -1.018042
2024-04-05   -0.077855
2024-04-06    0.382732
2024-04-07   -0.034242
2024-04-08    1.096347
2024-04-09   -0.234216
Freq: D, dtype: float64

With down-sampling, the original observations are grouped into bins, so, as with `groupby`, an aggregation must be applied to obtain a representative value for each bin.

In [36]:
ts.resample('W').mean()

2024-01-07   -0.425550
2024-01-14    0.455404
2024-01-21    0.378618
2024-01-28   -0.555120
2024-02-04    0.302427
2024-02-11   -0.086757
2024-02-18   -0.167352
2024-02-25    0.016776
2024-03-03   -0.115010
2024-03-10   -0.910216
2024-03-17    0.795141
2024-03-24   -0.004315
2024-03-31    0.555636
2024-04-07   -0.284874
2024-04-14    0.431066
Freq: W-SUN, dtype: float64

In [37]:
ts.resample('M').first()

2024-01-31   -1.768538
2024-02-29    0.314817
2024-03-31   -0.502817
2024-04-30   -0.073925
Freq: M, dtype: float64
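
The link to `groupby` can be made concrete by computing the same monthly totals both ways. This is a sketch with deterministic made-up values. It uses the `MS` alias (month start) rather than `M` so that it runs unchanged on pandas releases where `M` is deprecated.

```python
import numpy as np
import pandas as pd

daily = pd.Series(np.arange(100, dtype=float),
                  index=pd.date_range("2024-01-01", periods=100, freq="D"))

# Down-sample to months; each bin is labeled by the first day of the month.
monthly = daily.resample("MS").sum()

# The same grouping expressed with groupby on (year, month).
by_group = daily.groupby([daily.index.year, daily.index.month]).sum()

print(len(monthly))                                   # 4
print(monthly.iloc[0])                                # 465.0
print(np.allclose(monthly.values, by_group.values))   # True
```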

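For hour- and minute-level intervals, as opposed to daily or coarser ones, each bin includes its left edge (the earliest time) and excludes its right edge (the latest time); the right edge belongs to the next bin. For example, with 10-minute bins, every time that is a multiple of 10 minutes starts a new bin.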

In [43]:
ts = pd.Series(np.random.randn(60), index=pd.date_range(
    "2024-1-1", periods=60, freq="T"))
ts.head(20)

2024-01-01 00:00:00   -0.729045
2024-01-01 00:01:00    0.196557
2024-01-01 00:02:00    0.354758
2024-01-01 00:03:00    0.616887
2024-01-01 00:04:00    0.008628
2024-01-01 00:05:00    0.527004
2024-01-01 00:06:00    0.453782
2024-01-01 00:07:00   -1.829740
2024-01-01 00:08:00    0.037006
2024-01-01 00:09:00    0.767902
2024-01-01 00:10:00    0.589880
2024-01-01 00:11:00   -0.363859
2024-01-01 00:12:00   -0.805627
2024-01-01 00:13:00   -1.118312
2024-01-01 00:14:00   -0.131054
2024-01-01 00:15:00    1.133080
2024-01-01 00:16:00   -1.951804
2024-01-01 00:17:00   -0.659892
2024-01-01 00:18:00   -1.139802
2024-01-01 00:19:00    0.784958
Freq: T, dtype: float64

In [44]:
ts.resample('10T').sum()

2024-01-01 00:00:00    0.403739
2024-01-01 00:10:00   -3.662432
2024-01-01 00:20:00   -4.556258
2024-01-01 00:30:00   -2.903870
2024-01-01 00:40:00   -5.576132
2024-01-01 00:50:00    0.373248
Freq: 10T, dtype: float64

To include the right edge instead of the left, pass `closed="right"`. A time that is a multiple of 10 minutes then falls in the preceding bin.

In [45]:
ts.resample('10T', closed="right").sum()

2023-12-31 23:50:00   -0.729045
2024-01-01 00:00:00    1.722663
2024-01-01 00:10:00   -4.806622
2024-01-01 00:20:00   -3.650169
2024-01-01 00:30:00   -4.661613
2024-01-01 00:40:00   -4.302078
2024-01-01 00:50:00    0.505157
Freq: 10T, dtype: float64
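
With random data the effect of `closed` is hard to verify by eye. The sketch below repeats the comparison on the integers 0 to 19 (made-up values), so the bin sums can be checked by hand. It writes the minute alias as `min`, which both older and newer pandas releases accept.

```python
import pandas as pd

minutes = pd.Series(range(20),
                    index=pd.date_range("2024-01-01", periods=20, freq="min"))

# Default: bins are [00:00, 00:10) and [00:10, 00:20).
left = minutes.resample("10min").sum()

# closed="right": bins are (23:50, 00:00], (00:00, 00:10], (00:10, 00:20].
right = minutes.resample("10min", closed="right").sum()

print(left.tolist())    # [45, 145]
print(right.tolist())   # [0, 55, 135]
print(right.index[0])   # 2023-12-31 23:50:00
```

The value at 00:00 ends up alone in a bin labeled 23:50 of the previous day, which is why the `closed="right"` result has one more row.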

The `ohlc` method returns the open, high, low, and close values of each bin.

In [46]:
ts.resample('5T').ohlc()

                         open      high       low     close
2024-01-01 00:00:00 -0.729045  0.616887 -0.729045  0.008628
2024-01-01 00:05:00  0.527004  0.767902 -1.829740  0.767902
2024-01-01 00:10:00  0.589880  0.589880 -1.118312 -0.131054
2024-01-01 00:15:00  1.133080  1.133080 -1.951804  0.784958
2024-01-01 00:20:00 -0.554310  0.445393 -0.554310 -0.392389
2024-01-01 00:25:00 -3.046143  0.543312 -3.046143 -1.084037
2024-01-01 00:30:00  0.351780  0.379236 -0.930157 -0.930157
2024-01-01 00:35:00 -0.178589  0.417319 -1.550429  0.238103
2024-01-01 00:40:00 -1.405963  0.115148 -1.660700  0.115148
2024-01-01 00:45:00 -0.379148  0.895556 -1.742356  0.895556
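
The four columns correspond to the first, maximum, minimum, and last value in each bin. The sketch below checks this on made-up values by building the same table with `agg`; the names `prices`, `bars`, and `manual` are illustrative only.

```python
import pandas as pd

prices = pd.Series([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0],
                   index=pd.date_range("2024-01-01", periods=10, freq="min"))

bars = prices.resample("5min").ohlc()

# The same table assembled from four separate aggregations.
manual = prices.resample("5min").agg(["first", "max", "min", "last"])
manual.columns = ["open", "high", "low", "close"]

print(bars.iloc[0].tolist())                   # [3.0, 5.0, 1.0, 5.0]
print((bars.values == manual.values).all())    # True
```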


Up-sampling has to create values that do not exist in the data. Two options are forward filling, which carries the previous value forward, and backward filling, which pulls the next value back. They are available as the `ffill` and `bfill` methods.

In [47]:
ts.resample('30s').ffill().head(20)

2024-01-01 00:00:00   -0.729045
2024-01-01 00:00:30   -0.729045
2024-01-01 00:01:00    0.196557
2024-01-01 00:01:30    0.196557
2024-01-01 00:02:00    0.354758
2024-01-01 00:02:30    0.354758
2024-01-01 00:03:00    0.616887
2024-01-01 00:03:30    0.616887
2024-01-01 00:04:00    0.008628
2024-01-01 00:04:30    0.008628
2024-01-01 00:05:00    0.527004
2024-01-01 00:05:30    0.527004
2024-01-01 00:06:00    0.453782
2024-01-01 00:06:30    0.453782
2024-01-01 00:07:00   -1.829740
2024-01-01 00:07:30   -1.829740
2024-01-01 00:08:00    0.037006
2024-01-01 00:08:30    0.037006
2024-01-01 00:09:00    0.767902
2024-01-01 00:09:30    0.767902
Freq: 30S, dtype: float64

In [48]:
ts.resample('30s').bfill().head(20)

2024-01-01 00:00:00   -0.729045
2024-01-01 00:00:30    0.196557
2024-01-01 00:01:00    0.196557
2024-01-01 00:01:30    0.354758
2024-01-01 00:02:00    0.354758
2024-01-01 00:02:30    0.616887
2024-01-01 00:03:00    0.616887
2024-01-01 00:03:30    0.008628
2024-01-01 00:04:00    0.008628
2024-01-01 00:04:30    0.527004
2024-01-01 00:05:00    0.527004
2024-01-01 00:05:30    0.453782
2024-01-01 00:06:00    0.453782
2024-01-01 00:06:30   -1.829740
2024-01-01 00:07:00   -1.829740
2024-01-01 00:07:30    0.037006
2024-01-01 00:08:00    0.037006
2024-01-01 00:08:30    0.767902
2024-01-01 00:09:00    0.767902
2024-01-01 00:09:30    0.589880
Freq: 30S, dtype: float64
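
A three-point series makes the difference between the two fills easy to see. This sketch uses made-up values. It also calls `asfreq`, which is not used in the notebook above, only to show where the gaps are before any filling.

```python
import pandas as pd

coarse = pd.Series([1.0, 2.0, 3.0],
                   index=pd.date_range("2024-01-01", periods=3, freq="min"))

up = coarse.resample("30s")

# asfreq leaves the newly created slots empty (NaN).
print(up.asfreq().isna().tolist())  # [False, True, False, True, False]

# ffill copies the previous value; bfill copies the next value.
print(up.ffill().tolist())          # [1.0, 1.0, 2.0, 2.0, 3.0]
print(up.bfill().tolist())          # [1.0, 2.0, 2.0, 3.0, 3.0]
```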

## The `dt` accessor

A Series of datetime dtype has a `dt` accessor, which exposes a number of useful datetime attributes and methods.

In [49]:
s = pd.Series(pd.date_range("2020-12-25", periods=100, freq="D"))
s

0    2020-12-25
1    2020-12-26
2    2020-12-27
3    2020-12-28
4    2020-12-29
        ...    
95   2021-03-30
96   2021-03-31
97   2021-04-01
98   2021-04-02
99   2021-04-03
Length: 100, dtype: datetime64[ns]

For example, the `year`, `month`, `day`, and `weekday` attributes extract the year, month, day, and day of the week (`weekday` counts Monday as 0).

In [50]:
s.dt.year

0     2020
1     2020
2     2020
3     2020
4     2020
      ... 
95    2021
96    2021
97    2021
98    2021
99    2021
Length: 100, dtype: int64

In [51]:
s.dt.weekday

0     4
1     5
2     6
3     0
4     1
     ..
95    1
96    2
97    3
98    4
99    5
Length: 100, dtype: int64
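
These attributes are often used to build boolean masks. The sketch below selects the weekend dates from the same 100-day range. The name `is_weekend` is made up, and `day_name` is an additional `dt` method not shown in the notebook above.

```python
import pandas as pd

dates = pd.Series(pd.date_range("2020-12-25", periods=100, freq="D"))

# Saturday is 5 and Sunday is 6.
is_weekend = dates.dt.weekday >= 5

print(dates.dt.day_name().iloc[0])                # Friday
print(int(is_weekend.sum()))                      # 29
print(dates[is_weekend].head(2).dt.day.tolist())  # [26, 27]
```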

The `strftime` method formats the values as strings.

In [52]:
s.dt.strftime("%Y년 %m월 %d일")

0     2020년 12월 25일
1     2020년 12월 26일
2     2020년 12월 27일
3     2020년 12월 28일
4     2020년 12월 29일
          ...      
95    2021년 03월 30일
96    2021년 03월 31일
97    2021년 04월 01일
98    2021년 04월 02일
99    2021년 04월 03일
Length: 100, dtype: object

In [54]:
s.dt.strftime("%m/%d/%Y")

0     12/25/2020
1     12/26/2020
2     12/27/2020
3     12/28/2020
4     12/29/2020
         ...    
95    03/30/2021
96    03/31/2021
97    04/01/2021
98    04/02/2021
99    04/03/2021
Length: 100, dtype: object
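
The result of `strftime` is an ordinary string Series, so it no longer supports datetime operations. As a hedged sketch, the strings can be converted back by passing the same format to `pd.to_datetime`; the names `text` and `restored` are made up for this example.

```python
import pandas as pd

dates = pd.Series(pd.date_range("2020-12-25", periods=5, freq="D"))

# Format as month/day/year strings.
text = dates.dt.strftime("%m/%d/%Y")

# Parse the strings back using the same format.
restored = pd.to_datetime(text, format="%m/%d/%Y")

print(text.iloc[0])               # 12/25/2020
print((restored == dates).all())  # True
```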

````{admonition} Exercise 4.8.1

For the DataFrame created by the following code, compute the monthly sum of `value`.
(Hint: use the `groupby` method and the `dt` accessor.)

```
np.random.seed(0)
df = pd.DataFrame({
    "date": pd.date_range("2020-12-25", periods=100, freq="D"),
    "value": np.random.randint(100, size=(100,))
})
```
````