# Chapter 2. Ingesting Data into the Cloud

* 싸이그래머 / CloudΨ - DS on GCP [1]
* 김무성

# Contents 
* Preparation    
* Airline On-Time Performance Data
    * Knowability
    * Training–Serving Skew
    * Download Procedure
    * Dataset Attributes
* Why Not Store the Data in Situ?
    * Scaling Up
    * Scaling Out
    * Data in Situ with Colossus and Jupiter
* Ingesting Data
    * Reverse Engineering a Web Form
    * Dataset Download
    * Exploration and Cleanup
    * Uploading Data to Google Cloud Storage
* Scheduling Monthly Downloads
    * Ingesting in Python
    * Flask Web App
    * Running on App Engine
    * Securing the URL
    * Scheduling a Cron Task
* Summary
* Code Break

----------------------------

#### Reference 
* Getting started with Google Cloud Training Material - 2018 - https://www.slideshare.net/jkbaseer/getting-started-with-google-cloud-training-material-2018

<img src="https://image.slidesharecdn.com/cloudonboardtrainingmanual2018sg17april2018-180417132701/95/getting-started-with-google-cloud-training-material-2018-17-1024.jpg?cb=1523971737" width=800 />
<img src="https://image.slidesharecdn.com/cloudonboardtrainingmanual2018sg17april2018-180417132701/95/getting-started-with-google-cloud-training-material-2018-18-1024.jpg?cb=1523971737" width=800 />
<img src="https://image.slidesharecdn.com/cloudonboardtrainingmanual2018sg17april2018-180417132701/95/getting-started-with-google-cloud-training-material-2018-20-1024.jpg?cb=1523971737" width=800 />

-----------------------------------

## Chapter 1. - probabilistic decision criterion

<img src="../ch01/figures/cap02.png" width=600 />
<img src="../ch01/figures/cap03.png" width=600 />
<img src="../ch01/figures/cap04.png" width=600 />
<img src="../ch01/figures/cap05.png" width=600 />
<img src="../ch01/figures/cap06.png" width=600 />

---------------------

# Preparation

* US Bureau of Transportation Statistics (BTS) 
    - https://www.transtats.bts.gov/
    - Historical data on many flights is available from 1987 onward.
    - <font color="red">Airline On-Time Performance Data</font>
        - Because this is on-time arrival performance data,
        - it includes information about flight delays.
        - This is the dataset used in this book.



In [11]:
!ls

02_Ingesting_Data_into_the_Cloud.ipynb figures


In [12]:
!ls ../data-science-on-gcp/02_ingest/

README.md                ingest_from_crsbucket.sh upload.sh
download.sh              monthlyupdate            zip_to_csv.sh
ingest.sh                quotes_comma.sh


In [13]:
!cp ../data-science-on-gcp/02_ingest/download.sh .

In [14]:
!ls

02_Ingesting_Data_into_the_Cloud.ipynb figures
download.sh


In [None]:
# Look at the following part of the copied download.sh:

In [19]:
!head -n 5 download.sh

#!/bin/bash

export YEAR=${YEAR:=2015}
echo "Downloading YEAR=$YEAR..."



In [None]:
# Change {YEAR:=2015} to {YEAR:=2018}:

In [23]:
!head -n 5 download.sh

#!/bin/bash

export YEAR=${YEAR:=2018}
echo "Downloading YEAR=$YEAR..."



In [None]:
# Now run the script.

In [39]:
!sh download.sh

Downloading YEAR=2018...
201801
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5405  100   191  100  5214     10    299  0:00:19  0:00:17  0:00:02    44
Received <head><title>Object moved</title></head>
<body><h1>Object Moved</h1>This object may be found <a HREF="https://transtats.bts.gov/ftproot/TranStatsData/137304439_T_ONTIME.zip">here</a>.</body>
https://transtats.bts.gov/ftproot/TranStatsData/137304439_T_ONTIME.zip
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18.3M  100 18.3M    0     0   140k      0  0:02:14  0:02:14 --:--:--  112k
201802
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   

In [31]:
!ls

02_Ingesting_Data_into_the_Cloud.ipynb download.sh
201801.zip                             figures


In [32]:
!ls ../data-science-on-gcp/02_ingest/

README.md                ingest_from_crsbucket.sh upload.sh
download.sh              monthlyupdate            zip_to_csv.sh
ingest.sh                quotes_comma.sh


In [33]:
!cp ../data-science-on-gcp/02_ingest/zip_to_csv.sh .

In [34]:
!ls

02_Ingesting_Data_into_the_Cloud.ipynb figures
201801.zip                             zip_to_csv.sh
download.sh


In [None]:
# Look at the following part of the copied zip_to_csv.sh:

In [35]:
!head -n 5 zip_to_csv.sh

#!/bin/bash
echo ${YEAR:=2015}  # default if YEAR not set
for month in `seq -w 1 12`; do 
   unzip $YEAR$month.zip
   mv *ONTIME.csv $YEAR$month.csv


In [None]:
# Change {YEAR:=2015} to {YEAR:=2018}:

In [36]:
!head -n 5 zip_to_csv.sh

#!/bin/bash
echo ${YEAR:=2018}  # default if YEAR not set
for month in `seq -w 1 12`; do 
   unzip $YEAR$month.zip
   mv *ONTIME.csv $YEAR$month.csv
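
The two shell idioms used by these scripts can be read as follows: `${YEAR:=2018}` assigns the default when `YEAR` is unset or empty, and `seq -w 1 12` yields zero-padded month numbers. The sketch below mirrors that behavior in Python purely for illustration. The function names `resolve_year` and `monthly_basenames` are hypothetical and do not appear in the book's repository.

```python
import os


def resolve_year(env, default="2018"):
    """Mimic ${YEAR:=default}: fall back to the default when YEAR is unset or empty,
    and store the chosen value back into the environment mapping."""
    if not env.get("YEAR"):
        env["YEAR"] = default
    return env["YEAR"]


def monthly_basenames(year):
    """Mimic `seq -w 1 12`: return year + zero-padded month for all twelve months."""
    return ["{}{:02d}".format(year, month) for month in range(1, 13)]


if __name__ == "__main__":
    year = resolve_year(os.environ)
    for base in monthly_basenames(year):
        # Each archive is expected to be renamed to a CSV with the same base name.
        print(base + ".zip", "->", base + ".csv")
```

Setting `YEAR` in the environment before running overrides the default, which is the same way the shell scripts can be driven without editing them.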


In [None]:
# Now run the script.

In [40]:
!sh zip_to_csv.sh

2018
Archive:  201801.zip
  inflating: 137304439_T_ONTIME.csv  
Archive:  201802.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of 201802.zip or
        201802.zip.zip, and cannot find 201802.zip.ZIP, period.
mv: rename *ONTIME.csv to 201802.csv: No such file or directory
unzip:  cannot find or open 201803.zip, 201803.zip.zip or 201803.zip.ZIP.
mv: rename *ONTIME.csv to 201803.csv: No such file or directory
rm: 201803.zip: No such file or directory
unzip:  cannot find or open 201804.zip, 201804.zip.zip or 201804.zip.ZIP.
mv: rename *ONTIME.csv to 201804.csv: No such file or directory
rm: 201804.zip: No such file or directory
unzip:  cannot find or open 201805.zip, 201805.zip.zip or 201805.zip.ZIP.
mv: rename *ONTIME.csv to 201805.csv
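
The output above shows `unzip` rejecting `201802.zip` as not being a ZIP archive and then failing on months whose files do not exist, while the script keeps going. One way to guard against this is to validate each archive before extracting it. The following is a minimal sketch using only the standard library; the helper name `extract_month` is an assumption made for this example, not part of the book's scripts.

```python
import zipfile


def extract_month(zip_path, dest_dir="."):
    """Extract the CSV members of zip_path into dest_dir.

    Returns the list of extracted member names, or an empty list when the
    file is missing or is not a valid ZIP archive (for example, an
    interrupted download or an HTML error page saved with a .zip name).
    """
    if not zipfile.is_zipfile(zip_path):
        return []
    with zipfile.ZipFile(zip_path) as archive:
        names = [name for name in archive.namelist() if name.endswith(".csv")]
        archive.extractall(dest_dir, members=names)
    return names
```

A caller could loop over the twelve monthly archives, skip any month for which the function returns an empty list, and retry the download for those months only.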

In [41]:
!ls

02_Ingesting_Data_into_the_Cloud.ipynb figures
201801.csv                             zip_to_csv.sh
download.sh


In [43]:
!head 201801.csv

"FL_DATE","UNIQUE_CARRIER","AIRLINE_ID","CARRIER","FL_NUM","ORIGIN_AIRPORT_ID","ORIGIN_AIRPORT_SEQ_ID","ORIGIN_CITY_MARKET_ID","ORIGIN","DEST_AIRPORT_ID","DEST_AIRPORT_SEQ_ID","DEST_CITY_MARKET_ID","DEST","CRS_DEP_TIME","DEP_TIME","DEP_DELAY","TAXI_OUT","WHEELS_OFF","WHEELS_ON","TAXI_IN","CRS_ARR_TIME","ARR_TIME","ARR_DELAY","CANCELLED","CANCELLATION_CODE","DIVERTED","DISTANCE",
2018-01-01,"UA",19977,"UA","2429",11618,1161802,31703,"EWR",11292,1129202,30325,"DEN","1517","1512",-5.00,15.00,"1527","1712",10.00,"1745","1722",-23.00,0.00,"",0.00,1605.00,
2018-01-01,"UA",19977,"UA","2427",12889,1288903,32211,"LAS",14771,1477104,32457,"SFO","1115","1107",-8.00,11.00,"1118","1223",7.00,"1254","1230",-24.00,0.00,"",0.00,414.00,
2018-01-01,"UA",19977,"UA","2426",14908,1490803,32575,"SNA",11292,1129202,30325,"DEN","1335","1330",-5.00,15.00,"1345","1631",5.00,"1649","1636",-13.00,0.00,"",0.00,846.00,
2018-01-01,"UA",19977,"UA","2425",14635,1463502,31714,"RSW",13930,1393006,30977,"ORD","1546",

In [46]:
!head -2 201801.csv  | tail -1 | sed 's/,/ /g' | wc -w

      27
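
The shell pipeline above counts words after turning commas into spaces, so it would miscount a row that has an unquoted empty field or a field containing spaces. As a hedged alternative, the sketch below parses one line with Python's `csv` module and ignores the empty field produced by the trailing comma. The function name `count_columns` is hypothetical.

```python
import csv
import io


def count_columns(line):
    """Count the fields in one CSV line, ignoring the empty field that a
    trailing comma produces at the end of the line."""
    row = next(csv.reader(io.StringIO(line)))
    if row and row[-1] == "":
        row = row[:-1]
    return len(row)
```

Applied to the first data row of `201801.csv`, this should agree with the count of 27 reported above, because that row has no empty unquoted fields.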


In [47]:
!wc -l *.csv

  621614 201801.csv


In [48]:
import pandas as pd

In [49]:
fn = '201801.csv'
d = pd.read_csv(fn)

In [50]:
d.head()

Unnamed: 0,FL_DATE,UNIQUE_CARRIER,AIRLINE_ID,CARRIER,FL_NUM,ORIGIN_AIRPORT_ID,ORIGIN_AIRPORT_SEQ_ID,ORIGIN_CITY_MARKET_ID,ORIGIN,DEST_AIRPORT_ID,...,WHEELS_ON,TAXI_IN,CRS_ARR_TIME,ARR_TIME,ARR_DELAY,CANCELLED,CANCELLATION_CODE,DIVERTED,DISTANCE,Unnamed: 27
0,2018-01-01,UA,19977,UA,2429,11618,1161802,31703,EWR,11292,...,1712.0,10.0,1745,1722.0,-23.0,0.0,,0.0,1605.0,
1,2018-01-01,UA,19977,UA,2427,12889,1288903,32211,LAS,14771,...,1223.0,7.0,1254,1230.0,-24.0,0.0,,0.0,414.0,
2,2018-01-01,UA,19977,UA,2426,14908,1490803,32575,SNA,11292,...,1631.0,5.0,1649,1636.0,-13.0,0.0,,0.0,846.0,
3,2018-01-01,UA,19977,UA,2425,14635,1463502,31714,RSW,13930,...,1748.0,6.0,1756,1754.0,-2.0,0.0,,0.0,1120.0,
4,2018-01-01,UA,19977,UA,2424,13930,1393006,30977,ORD,10257,...,926.0,10.0,922,936.0,14.0,0.0,,0.0,723.0,


In [51]:
d.describe()

Unnamed: 0,AIRLINE_ID,FL_NUM,ORIGIN_AIRPORT_ID,ORIGIN_AIRPORT_SEQ_ID,ORIGIN_CITY_MARKET_ID,DEST_AIRPORT_ID,DEST_AIRPORT_SEQ_ID,DEST_CITY_MARKET_ID,CRS_DEP_TIME,DEP_TIME,...,WHEELS_OFF,WHEELS_ON,TAXI_IN,CRS_ARR_TIME,ARR_TIME,ARR_DELAY,CANCELLED,DIVERTED,DISTANCE,Unnamed: 27
count,621613.0,621613.0,621613.0,621613.0,621613.0,621613.0,621613.0,621613.0,621613.0,602890.0,...,601762.0,601228.0,601219.0,621613.0,602080.0,600857.0,621613.0,621613.0,621613.0,0.0
mean,20023.584985,2725.570397,12683.124867,1268316.0,31769.210683,12682.911176,1268295.0,31769.205091,1326.569938,1333.335882,...,1359.80828,1476.55199,7.490016,1493.005754,1482.041353,3.173973,0.030772,0.002263,761.128849,
std,410.893628,1913.261489,1517.027372,151702.5,1306.731354,1516.957095,151695.5,1306.656269,483.627437,493.226616,...,493.039381,514.968248,5.906399,507.46652,518.583014,49.643814,0.172698,0.047522,582.64456,
min,19393.0,1.0,10135.0,1013505.0,30070.0,10135.0,1013505.0,30070.0,1.0,1.0,...,1.0,1.0,0.0,1.0,1.0,-1290.0,0.0,0.0,16.0,
25%,19790.0,1044.0,11292.0,1129202.0,30721.0,11292.0,1129202.0,30721.0,916.0,924.0,...,941.0,1059.0,4.0,1109.0,1103.0,-17.0,0.0,0.0,337.0,
50%,19977.0,2239.0,12889.0,1288903.0,31453.0,12889.0,1288903.0,31453.0,1320.0,1329.0,...,1343.0,1511.0,6.0,1520.0,1515.0,-8.0,0.0,0.0,599.0,
75%,20378.0,4444.0,14057.0,1405702.0,32575.0,14057.0,1405702.0,32575.0,1730.0,1737.0,...,1753.0,1909.0,9.0,1915.0,1914.0,6.0,0.0,0.0,1005.0,
max,21171.0,9375.0,16218.0,1621801.0,36133.0,16218.0,1621801.0,36133.0,2359.0,2400.0,...,2400.0,2400.0,258.0,2359.0,2400.0,2023.0,1.0,1.0,4983.0,


--------------------------------

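# Exercise 1 - Opening the source in the Google Cloud Shell web browser console

1. Assuming that you created a GCP project in the previous session and have already downloaded the book's exercise code, open Google Cloud Shell.
    *  https://console.cloud.google.com/
    
2. In the Google Cloud Shell web browser console, repeat the preparation steps yourself.
    * Inside data-science-on-gcp, where the source code lives,
    * create a directory named data,
    * change into the data directory,
    * and run the scripts. 
    
    Example: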
```shell
bash ../02_ingest/download.sh
```
    
    

----------------------------

# Airline On-Time Performance Data
* Knowability
* Training–Serving Skew
* Download Procedure
* Dataset Attributes

<img src="figures/cap01.png" width=600 />

## Knowability

## Training–Serving Skew

## Download Procedure

<img src="figures/cap02.png" width=600 />

## Dataset Attributes

----------------------------

# Why Not Store the Data in Situ?
* Scaling Up
* Scaling Out
* Data in Situ with Colossus and Jupiter

<img src="figures/cap03.png" width=600 />

## Scaling Up

<img src="figures/cap04.png" width=600 />

## Scaling Out

<img src="figures/cap05.png" width=600 />

## Data in Situ with Colossus and Jupiter

<img src="figures/cap06.png" width=600 />

-----------------------

# Ingesting Data
* Reverse Engineering a Web Form
* Dataset Download
* Exploration and Cleanup
* Uploading Data to Google Cloud Storage   

<img src="figures/cap07.png" width=600 />

## Reverse Engineering a Web Form

<img src="figures/cap08.png" width=600 />

<img src="figures/cap09.png" width=600 />

<img src="figures/cap10.png" width=600 />

<img src="figures/cap11.png" width=600 />

## Dataset Download

<img src="figures/cap12.png" width=600 />

<img src="figures/cap13.png" width=600 />

## Exploration and Cleanup

## Uploading Data to Google Cloud Storage

#### References
Google Cloud Storage
* https://cloud.google.com/storage/
* http://jybaek.tistory.com/642

#### Creating a bucket
1. Go to the Google Cloud Storage console
    * https://console.cloud.google.com/storage/


2. Create your first bucket
<img src="figures/storage_bucket.png">

3. From the console, you can run commands with gsutil in the following forms.
    * List 
```shell
gsutil ls -a gs://[BUCKET_NAME]
```
    * Upload
```shell
gsutil -m cp *.csv gs://cloud-training-demos-ml/flights/raw/
```
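
When the upload is driven from a notebook or script rather than typed by hand, it can help to assemble the `gsutil -m cp` arguments in one place. The sketch below only builds the argument list and leaves the actual call commented out. The function name `build_upload_command` and the bucket name `my-bucket` are assumptions for illustration; substitute your own bucket.

```python
import glob
import subprocess


def build_upload_command(files, bucket, prefix):
    """Return the argument list for `gsutil -m cp <files> gs://<bucket>/<prefix>/`."""
    destination = "gs://{}/{}/".format(bucket, prefix.strip("/"))
    return ["gsutil", "-m", "cp"] + list(files) + [destination]


if __name__ == "__main__":
    csv_files = sorted(glob.glob("*.csv"))
    # "my-bucket" is a placeholder bucket name, not one from the book.
    command = build_upload_command(csv_files, "my-bucket", "flights/raw")
    print(" ".join(command))
    # Uncomment once gsutil is installed and authenticated:
    # subprocess.check_call(command)
```

Passing the files as a list avoids relying on shell globbing, so the same command works whether it is launched from a notebook cell or from a scheduled job.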

In [None]:
# List the bucket contents

In [56]:
!gsutil ls gs://[BUCKET_NAME]

In [58]:
!gsutil ls gs://ds-on-gcp_cloudpsy_ms

gs://ds-on-gcp_cloudpsy_ms/43563921_1814266975383729_223249561072697344_n.jpg


In [None]:
# Upload one file through the web browser (then verify it)

In [None]:
!gsutil ls gs://[BUCKET_NAME]

In [None]:
# Upload the data from the console

-----------------------

# Scheduling Monthly Downloads
* Ingesting in Python
* Flask Web App
* Running on App Engine
* Securing the URL
* Scheduling a Cron Task

<img src="figures/cap14.png" width=600 />

<img src="https://image.slidesharecdn.com/cloudonboardtrainingmanual2018sg17april2018-180417132701/95/getting-started-with-google-cloud-training-material-2018-25-1024.jpg?cb=1523971737" width=800 />

# Exercise 2 - Registering a Cron service with Google App Engine to fetch data periodically and save it to Cloud Storage

Go to Google Cloud Shell

https://console.cloud.google.com/

1. Go to the <b>02_ingest/monthlyupdate</b> folder in the repo.<br><br>

2. Initialize a default App Engine application in your project by running <b>./init_appengine.sh</b>.<br><br>

3. Open the file <b>app.yaml</b> and change the <b>CLOUD_STORAGE_BUCKET</b> to reflect the name of your bucket.<br><br>

4. Run <b>./deploy.sh</b> to deploy the Cron service app. This will take 5 to 10 minutes.

5. Visit the Google Cloud Platform web console and navigate to the App Engine section. 
    * You should see two services: 
        - one is the default service (which is just a Hello World application), and 
        - the other is the flights service.<br><br>
        
6. Click the <b>flights</b> service, follow the link to ingest the data, and you’ll find that your access is forbidden—the ingest capability is available only to the Cron service (or from the Google Cloud Platform web console by clicking the “Run now” button in the task queues section of App Engine). If you click <b>“Run now,”</b> a few minutes later, you’ll see the next month’s data show up in the storage bucket.<br><br>

7. Stop the flights application—you won’t need it any further.<br><br>

8. Because software changes, an up-to-date list of the preceding steps is available in the course repository in <b>02_ingest/README.md</b>. This is true for all the following chapters.<br><br>

## Ingesting in Python

1. Download data from the BTS website to a local file.<br><br>

2. Unzip the downloaded ZIP file and extract the CSV file it contains.<br><br>

3. Remove quotes and the trailing comma from the CSV file.<br><br>

4. Upload the CSV file to Google Cloud Storage.<br><br>
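
Step 3 above can be sketched in a few lines. This is a minimal illustration, not the code from `ingest_flights.py`: the names `clean_line` and `clean_csv` are hypothetical, and dropping every double quote assumes that no quoted field contains a comma, which holds for the columns shown earlier in this notebook.

```python
def clean_line(line):
    """Remove double quotes and a single trailing comma from one raw CSV line."""
    line = line.rstrip("\r\n").replace('"', "")
    if line.endswith(","):
        line = line[:-1]
    return line


def clean_csv(src_path, dst_path):
    """Write a cleaned copy of src_path to dst_path, one line at a time."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(clean_line(line) + "\n")
```

Processing the file line by line keeps memory use flat, which matters because a single month of data has several hundred thousand rows.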

In [5]:
!ls ../data-science-on-gcp/02_ingest/monthlyupdate/

app.yaml          cron_testing.yaml ingest_flights.py init_appengine.sh
cron.yaml         deploy.sh         ingestapp.py      requirements.txt


In [6]:
import os

In [7]:
os.chdir('../data-science-on-gcp/02_ingest/monthlyupdate/')

In [8]:
%ls

app.yaml           cron_testing.yaml  ingest_flights.py* init_appengine.sh*
cron.yaml          deploy.sh*         ingestapp.py*      requirements.txt


In [9]:
%cat ingest_flights.py

#!/usr/bin/env python

# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import shutil
import logging
import os.path
import zipfile
import datetime
import tempfile
from urllib2 import urlopen
from google.cloud import storage
from google.cloud.storage import Blob

def download(YEAR, MONTH, destdir):
   '''
     Downloads on-time performance data and returns local filename
     YEAR e.g.'2015'
     MONTH e.g. '01 for January
   ''

## Flask Web App

In [10]:
%ls

app.yaml           cron_testing.yaml  ingest_flights.py* init_appengine.sh*
cron.yaml          deploy.sh*         ingestapp.py*      requirements.txt


In [12]:
%cat ingestapp.py

#!/usr/bin/env python
# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# [START app]
import os
import logging
import ingest_flights

import flask

# [start config]
app = flask.Flask(__name__)
# Configure this environment variable via app.yaml
CLOUD_STORAGE_BUCKET = os.environ['CLOUD_STORAGE_BUCKET']
#
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)
# [end config]

@app.route('/')
def welcome():
         retur

## Running on App Engine

In [13]:
%ls

app.yaml           cron_testing.yaml  ingest_flights.py* init_appengine.sh*
cron.yaml          deploy.sh*         ingestapp.py*      requirements.txt


In [11]:
%cat app.yaml

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT ingestapp:app
service: flights
manual_scaling:
  instances: 1

#[START env]
env_variables:
    CLOUD_STORAGE_BUCKET: cloud-training-demos-ml
#[END env]

handlers:
- url: /ingest
  script: ingestapp.app

- url: /.*
  script: ingestapp.app


## Securing the URL

In [14]:
%ls

app.yaml           cron_testing.yaml  ingest_flights.py* init_appengine.sh*
cron.yaml          deploy.sh*         ingestapp.py*      requirements.txt


#### Requests that do not come from the App Engine cron job are rejected with an error message.

```python
@app.route('/ingest')
def ingest_next_month():
    try:
         # verify that this is a cron job request
         is_cron = flask.request.headers['X-Appengine-Cron']
         logging.info('Received cron request {}'.format(is_cron))

         # next month
         bucket = CLOUD_STORAGE_BUCKET
         year, month = ingest_flights.next_month(bucket)
         status = 'scheduling ingest of year={} month={}'.format(year, month)
         logging.info(status)

         # ingest ...
         gcsfile = ingest_flights.ingest(year, month, bucket)
         status = 'successfully ingested={}'.format(gcsfile)
         logging.info(status)

    except ingest_flights.DataUnavailable:
         status = 'File for {}-{} not available yet ...'.format(year, month)
         logging.info(status)

    except KeyError as e:
         status = '<html>Sorry, this capability is accessible only by the Cron service, but I got a KeyError for {} -- try invoking it from <a href="{}"> the GCP console / AppEngine / taskqueues </a></html>'.format(e, 'http://console.cloud.google.com/appengine/taskqueues?tab=CRON')
         logging.info('Rejected non-Cron request')

    # a Flask view must return a response; send the status text back to the caller
    return status
```
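
The handler above relies on one idea: a request is treated as a Cron request only if it carries the `X-Appengine-Cron` header, which App Engine's documentation describes as set internally and stripped from external requests. The sketch below isolates that check so it can be exercised without Flask or App Engine. The function `handle_ingest` and its `ingest` callback are hypothetical stand-ins, and a plain `dict` is used for the headers even though Flask's real header object is case-insensitive.

```python
def handle_ingest(headers, ingest):
    """Call ingest() only when the X-Appengine-Cron header is present.

    headers: a mapping of request header names to values.
    ingest:  a zero-argument callable that performs the ingest and returns
             a description of what was stored.
    """
    try:
        is_cron = headers["X-Appengine-Cron"]
    except KeyError:
        # Missing header: refuse to run the ingest.
        return "rejected: not a Cron request"
    return "ingested {} (cron={})".format(ingest(), is_cron)
```

Keeping the check in a small function like this makes it straightforward to unit test the rejection path, which is hard to trigger from the deployed service itself.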

## Scheduling a Cron Task

In [16]:
%cat cron.yaml

cron:
- description : ingest monthly flight data
  url : /ingest
  schedule: 8 of month 10:00
  timezone: US/Eastern
  target: flights
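
The schedule `8 of month 10:00` asks for one run on the 8th day of each month at 10:00 in the configured timezone. To make that concrete, the sketch below computes the next such moment after a given time. It is an illustration only: `next_run` is a hypothetical helper, it works on naive datetimes, and it ignores the `US/Eastern` timezone handling that App Engine performs.

```python
import datetime


def next_run(now, day=8, hour=10):
    """Return the first datetime strictly after `now` that falls on the
    given day of a month at hour:00 (naive local time)."""
    candidate = now.replace(day=day, hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # This month's slot has passed, so roll over to the next month.
        year = now.year + (1 if now.month == 12 else 0)
        month = 1 if now.month == 12 else now.month + 1
        candidate = candidate.replace(year=year, month=month)
    return candidate


if __name__ == "__main__":
    print(next_run(datetime.datetime.now()))
```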


-----------------------

# Summary

-----------------------

# Code Break

# References
* [1] Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning - https://www.amazon.com/Data-Science-Google-Cloud-Platform/dp/1491974567
* [2] Book github - https://github.com/GoogleCloudPlatform/data-science-on-gcp