【National Taipei University】Python Programming<br>
【Instructor】[陳祥輝 (Email : HsiangHui.Chen@gmail.com)](mailto:HsiangHui.Chen@gmail.com)<br>
【facebook】[Instructor 陳祥輝's Facebook page (friend requests welcome)](https://goo.gl/osivhx)<br>
【Reference book】[從零開始學Python程式設計（適用Python 3.5以上）](http://www.drmaster.com.tw/Bookinfo.asp?BookID=MP31821)<br>
【Main topic】Date and Time

【Key points】
- Python <font color=#0000FF>time</font> Module
- Date-related functions and classes in pandas
    - [pandas.date_range()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html) : Return a fixed frequency DatetimeIndex.
    - [pandas.DatetimeIndex()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.html#pandas.DatetimeIndex) : An immutable container for datetimes.
    - [pandas.timedelta_range()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.timedelta_range.html#pandas.timedelta_range) : Return a fixed frequency TimedeltaIndex.
    - [pandas.period_range()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.period_range.html#pandas.period_range) : Return a fixed frequency PeriodIndex.
    - [pandas.interval_range()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.interval_range.html#pandas.interval_range) : Return a fixed frequency IntervalIndex.
- Python <font color=#0000FF>datetime</font> Module

【References】
- [Offset Aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases)
- [Python datetime Module](https://www.programiz.com/python-programming/datetime)

In [1]:
# -*- coding: utf-8 -*-
from platform import python_version
import os, time, socket
import pandas as pd
import numpy as np

print("【Date/time】{}".format(time.strftime("%Y-%m-%d %H:%M:%S")))
print("【Working directory】{}".format(os.getcwd()))
print("【Hostname】{} ({})".format(socket.gethostname(),socket.gethostbyname(socket.gethostname())))
print("【Python】{}".format(python_version()))

【Date/time】2021-04-20 14:21:38
【Working directory】C:\Users\NTPU Computer Center\Desktop
【Hostname】1MF08-06 (10.137.110.6)
【Python】3.8.5


### <font color=#0000FF>time.strftime()</font>
<pre>
%Y  Year with century as a decimal number.
%m  Month as a decimal number [01,12].
%d  Day of the month as a decimal number [01,31].
%H  Hour (24-hour clock) as a decimal number [00,23].
%M  Minute as a decimal number [00,59].
%S  Second as a decimal number [00,61].
%z  Time zone offset from UTC.
%a  Locale's abbreviated weekday name.
%A  Locale's full weekday name.
%b  Locale's abbreviated month name.
%B  Locale's full month name.
%c  Locale's appropriate date and time representation.
%I  Hour (12-hour clock) as a decimal number [01,12].
%p  Locale's equivalent of either AM or PM.
</pre>
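
The directives above can be tried on a fixed point in time instead of the current clock, which makes the result repeatable. The following is a minimal sketch: the variable names are illustrative, and the timestamp is simply the one printed by the first cell. Directives such as %A and %p depend on the active locale, so their output may differ from machine to machine.

```python
import time

# Parse a fixed timestamp into a struct_time so the result is repeatable
t = time.strptime("2021-04-20 14:21:38", "%Y-%m-%d %H:%M:%S")

# Numeric directives do not depend on the locale
iso_text = time.strftime("%Y-%m-%d %H:%M:%S", t)
clock_12h = time.strftime("%I:%M", t)   # hour on the 12-hour clock

print(iso_text)
print(clock_12h)

# Locale-dependent directives: weekday name, month name, AM/PM marker
print(time.strftime("%A, %d %B %Y %p", t))
```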

### <font color=#0000FF>pandas.date_range()</font>

pandas.date_range(start=None, end=None, periods=None, freq=None, tz=None, normalize=False, name=None, closed=None, **kwargs)

- start : str or datetime-like, optional. Left bound for generating dates.
- end : str or datetime-like, optional. Right bound for generating dates.
- periods : integer, optional. Number of periods to generate.
- freq : str or DateOffset, default ‘D’. Frequency strings can have multiples, e.g. ‘5H’. See [Offset Aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases) for a list of frequency aliases.
- tz : str or tzinfo, optional. Time zone name for returning localized DatetimeIndex, for example ‘Asia/Hong_Kong’. By default, the resulting DatetimeIndex is timezone-naive.
- normalize : bool, default False. Normalize start/end dates to midnight before generating date range.
- name : str, default None. Name of the resulting DatetimeIndex.
- closed : {None, ‘left’, ‘right’}, optional. Make the interval closed with respect to the given frequency to the ‘left’, ‘right’, or both sides (None, the default). In pandas 1.4 and later this parameter is deprecated in favor of inclusive ({‘both’, ‘neither’, ‘left’, ‘right’}), and it was removed in pandas 2.0.
- \**kwargs : For compatibility. Has no effect on the result.

---
[References]
- [pandas.date_range](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html)
- [Offset Aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases)
- [Using the urllib.request Module](https://stackabuse.com/download-files-with-python/)

#### <font color=#0000FF>(1) The freq parameter: given a start and an end date, pick dates according to freq</font>
- [Offset Aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases) <font color='red'>【★★★★☆】</font>

In [2]:
pd.date_range(start='2021-12-15', end='2022-03-15', freq='2D')

DatetimeIndex(['2021-12-15', '2021-12-17', '2021-12-19', '2021-12-21',
               '2021-12-23', '2021-12-25', '2021-12-27', '2021-12-29',
               '2021-12-31', '2022-01-02', '2022-01-04', '2022-01-06',
               '2022-01-08', '2022-01-10', '2022-01-12', '2022-01-14',
               '2022-01-16', '2022-01-18', '2022-01-20', '2022-01-22',
               '2022-01-24', '2022-01-26', '2022-01-28', '2022-01-30',
               '2022-02-01', '2022-02-03', '2022-02-05', '2022-02-07',
               '2022-02-09', '2022-02-11', '2022-02-13', '2022-02-15',
               '2022-02-17', '2022-02-19', '2022-02-21', '2022-02-23',
               '2022-02-25', '2022-02-27', '2022-03-01', '2022-03-03',
               '2022-03-05', '2022-03-07', '2022-03-09', '2022-03-11',
               '2022-03-13', '2022-03-15'],
              dtype='datetime64[ns]', freq='2D')

In [3]:
pd.date_range(start='2021-04-01', end='2021-05-15', freq='B')     # business day frequency

DatetimeIndex(['2021-04-01', '2021-04-02', '2021-04-05', '2021-04-06',
               '2021-04-07', '2021-04-08', '2021-04-09', '2021-04-12',
               '2021-04-13', '2021-04-14', '2021-04-15', '2021-04-16',
               '2021-04-19', '2021-04-20', '2021-04-21', '2021-04-22',
               '2021-04-23', '2021-04-26', '2021-04-27', '2021-04-28',
               '2021-04-29', '2021-04-30', '2021-05-03', '2021-05-04',
               '2021-05-05', '2021-05-06', '2021-05-07', '2021-05-10',
               '2021-05-11', '2021-05-12', '2021-05-13', '2021-05-14'],
              dtype='datetime64[ns]', freq='B')

In [4]:
pd.date_range(start='2020-12-15', end='2021-03-15', freq='M')     # month end frequency

DatetimeIndex(['2020-12-31', '2021-01-31', '2021-02-28'], dtype='datetime64[ns]', freq='M')

In [5]:
pd.date_range(start='2020-12-15', end='2021-03-15', freq='MS')    # month start frequency

DatetimeIndex(['2021-01-01', '2021-02-01', '2021-03-01'], dtype='datetime64[ns]', freq='MS')
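
To compare several offset aliases side by side, the sketch below counts how many dates each alias yields over the same month. The dictionary name `counts` and the alias `W-MON` (weekly, anchored on Monday) are additions for illustration and do not appear in the cells above. The month-end alias is left out on purpose, because its spelling has changed in newer pandas releases.

```python
import pandas as pd

start, end = "2021-04-01", "2021-04-30"

# Number of dates generated by each alias over the same date span
counts = {}
for alias in ["D", "B", "W-MON", "MS"]:
    idx = pd.date_range(start=start, end=end, freq=alias)
    counts[alias] = len(idx)
    print(alias, len(idx), idx[0].date(), idx[-1].date())
```

Calendar days give the most dates, business days skip weekends, and the anchored aliases keep only the dates that fall exactly on the anchor (a Monday, or the first day of a month).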

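#### <font color=#0000FF>(2) The closed parameter controls whether the start and end dates are included</font>
- closed = None : includes both the start and the end
- closed = 'left' : includes the start, excludes the end
- closed = 'right' : excludes the start, includes the end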

In [7]:
pd.date_range(start='2020-12-15', end='2021-01-15', freq='1D',closed=None)

DatetimeIndex(['2020-12-15', '2020-12-16', '2020-12-17', '2020-12-18',
               '2020-12-19', '2020-12-20', '2020-12-21', '2020-12-22',
               '2020-12-23', '2020-12-24', '2020-12-25', '2020-12-26',
               '2020-12-27', '2020-12-28', '2020-12-29', '2020-12-30',
               '2020-12-31', '2021-01-01', '2021-01-02', '2021-01-03',
               '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07',
               '2021-01-08', '2021-01-09', '2021-01-10', '2021-01-11',
               '2021-01-12', '2021-01-13', '2021-01-14', '2021-01-15'],
              dtype='datetime64[ns]', freq='D')

In [8]:
pd.date_range(start='2020-12-15', end='2021-01-15', freq='1D',closed='left')

DatetimeIndex(['2020-12-15', '2020-12-16', '2020-12-17', '2020-12-18',
               '2020-12-19', '2020-12-20', '2020-12-21', '2020-12-22',
               '2020-12-23', '2020-12-24', '2020-12-25', '2020-12-26',
               '2020-12-27', '2020-12-28', '2020-12-29', '2020-12-30',
               '2020-12-31', '2021-01-01', '2021-01-02', '2021-01-03',
               '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07',
               '2021-01-08', '2021-01-09', '2021-01-10', '2021-01-11',
               '2021-01-12', '2021-01-13', '2021-01-14'],
              dtype='datetime64[ns]', freq='D')

In [9]:
pd.date_range(start='2020-12-15', end='2021-01-15', freq='1D',closed='right')

DatetimeIndex(['2020-12-16', '2020-12-17', '2020-12-18', '2020-12-19',
               '2020-12-20', '2020-12-21', '2020-12-22', '2020-12-23',
               '2020-12-24', '2020-12-25', '2020-12-26', '2020-12-27',
               '2020-12-28', '2020-12-29', '2020-12-30', '2020-12-31',
               '2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
               '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
               '2021-01-09', '2021-01-10', '2021-01-11', '2021-01-12',
               '2021-01-13', '2021-01-14', '2021-01-15'],
              dtype='datetime64[ns]', freq='D')
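
Because the name of this parameter depends on the installed pandas version, a version-independent way to get the same three results is to build the range with both endpoints and then filter it. This is only a sketch of that alternative, not the notebook's own method; the variable names are illustrative and the date span is shortened to keep the output small.

```python
import pandas as pd

start, end = "2020-12-15", "2020-12-18"

# Both endpoints included (the default behaviour)
both = pd.date_range(start=start, end=end, freq="D")

# Drop the end date: same result as the 'left' setting
left = both[both < pd.Timestamp(end)]

# Drop the start date: same result as the 'right' setting
right = both[both > pd.Timestamp(start)]

print(both.strftime("%Y-%m-%d").tolist())
print(left.strftime("%Y-%m-%d").tolist())
print(right.strftime("%Y-%m-%d").tolist())
```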

#### <font color=#0000FF>(3) Given a start date and/or an end date, generate a fixed number of dates (periods)</font>
- [Offset Aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases)
- How start / end can be specified
    - start only : dates are generated forward from start
    - end only : dates are generated backward from end
    - both start and end : periods evenly spaced dates are taken between start and end

In [10]:
pd.date_range(start='2021-12-15', periods=8, freq='M',closed=None)

DatetimeIndex(['2021-12-31', '2022-01-31', '2022-02-28', '2022-03-31',
               '2022-04-30', '2022-05-31', '2022-06-30', '2022-07-31'],
              dtype='datetime64[ns]', freq='M')

In [11]:
pd.date_range(end='2021-12-15', periods=8, freq='M',closed=None)

DatetimeIndex(['2021-04-30', '2021-05-31', '2021-06-30', '2021-07-31',
               '2021-08-31', '2021-09-30', '2021-10-31', '2021-11-30'],
              dtype='datetime64[ns]', freq='M')

In [12]:
pd.date_range(start='2021-12-15', end='2022-04-30', periods=8, closed=None)

DatetimeIndex([          '2021-12-15 00:00:00',
               '2022-01-03 10:17:08.571428571',
               '2022-01-22 20:34:17.142857143',
               '2022-02-11 06:51:25.714285714',
               '2022-03-02 17:08:34.285714286',
               '2022-03-22 03:25:42.857142858',
               '2022-04-10 13:42:51.428571428',
                         '2022-04-30 00:00:00'],
              dtype='datetime64[ns]', freq=None)
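
The three cases can also be seen on a smaller scale. The sketch below uses the daily and month-start aliases with only three periods; the variable names are illustrative.

```python
import pandas as pd

# start only: count forward from the start date
forward = pd.date_range(start="2021-12-15", periods=3, freq="D")

# end only: count backward so that the range stops at the end date
backward = pd.date_range(end="2021-12-15", periods=3, freq="D")

# start only with 'MS': the first date is the first month start on or after start
month_starts = pd.date_range(start="2021-12-15", periods=3, freq="MS")

# start and end with periods (no freq): evenly spaced points between the two
between = pd.date_range(start="2021-01-01", end="2021-01-05", periods=3)

for idx in (forward, backward, month_starts, between):
    print(idx.strftime("%Y-%m-%d").tolist())
```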

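### <font color=#0000FF>Assignment 3 (2021-04-20)</font>
Choose a start date and an end date (for example, 2020-12-15 to 2021-01-15) and download the files listed in item 1 below to C:\temp. Each file must be saved in the directory that matches its date, for example C:\temp\2020\12\2020-12-15.csv.gz. If that directory does not exist, the program must create it automatically; creating it by hand is not allowed. You must also work out how to decompress the downloaded files in code; decompressing them manually is not allowed. For the download itself, see the URL in item 2.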
1. [CRAN package download logs](http://cran-logs.rstudio.com/)
2. [Using the urllib.request Module](https://stackabuse.com/download-files-with-python/)

【Notes】
1. Name the download function downloadFiles(srcURL, destPath, datesList)
2. Name the decompression function zipFiles(path)

In [None]:
import os
import urllib.request

dest_root = r"C:\temp"
for adate in pd.date_range(start="2020-12-15", end="2021-01-15", freq="D"):
    # Target folder such as C:\temp\2020\12; create it if it does not exist yet
    folder = os.path.join(dest_root, adate.strftime("%Y"), adate.strftime("%m"))
    os.makedirs(folder, exist_ok=True)
    # Source file such as http://cran-logs.rstudio.com/2020/2020-12-15.csv.gz
    url = adate.strftime("http://cran-logs.rstudio.com/%Y/%Y-%m-%d.csv.gz")
    urllib.request.urlretrieve(url, os.path.join(folder, adate.strftime("%Y-%m-%d.csv.gz")))
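
The decompression step of the assignment is not covered by the cell above. One possible sketch, using only the standard library, walks the download folder and unpacks every .gz file next to its archive. The function name zipFiles(path) comes from the assignment notes; the return value and the choice of gzip with shutil are assumptions made for illustration, not a required solution.

```python
import gzip
import os
import shutil


def zipFiles(path):
    """Decompress every .gz file found under path, next to the archive."""
    extracted = []
    for folder, _subdirs, files in os.walk(path):
        for name in files:
            if not name.endswith(".gz"):
                continue
            src = os.path.join(folder, name)
            dst = src[:-3]  # strip ".gz": 2020-12-15.csv.gz -> 2020-12-15.csv
            # Stream the data so that large log files are not read into memory at once
            with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
                shutil.copyfileobj(f_in, f_out)
            extracted.append(dst)
    return extracted


# Example call after the download has finished:
# zipFiles(r"C:\temp")
```

The archives are kept after extraction; removing them with os.remove(src) would be a small extension if disk space matters.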