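# Web Crawling with Scrapy

Scrapy is a Python package for web crawling. It processes many requests concurrently (it is built on the asynchronous Twisted framework rather than on multiple processes) to improve crawling efficiency, and it can also record the results in a database.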

* http://doc.scrapy.org/en/latest/index.html


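A web crawling application built with Scrapy is developed in the following order.

* Inspect the document structure with the Scrapy shell
* Create a Scrapy project
* Implement the Spider class
* Implement the Item class
* Implement the Pipeline
* Configure the settings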


## Scrapy shell

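The Scrapy shell is an interactive shell tool that runs from the console. It takes the URL of the website to crawl as its argument. For example, to access the page https://www.google.com/finance/historical?q=KRX%3AKOSPI200, run the following command. The URL is quoted so that the shell does not interpret the `?` character.

```
$ scrapy shell 'https://www.google.com/finance/historical?q=KRX%3AKOSPI200'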
```

Running the command prints output like the following and then starts an IPython console.


```
2016-07-07 08:28:13 [scrapy] INFO: Scrapy 1.0.3 started (bot: scrapybot)
2016-07-07 08:28:13 [scrapy] INFO: Optional features available: ssl, http11, boto
2016-07-07 08:28:13 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2016-07-07 08:28:13 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, CoreStats, SpiderState
2016-07-07 08:28:13 [boto] DEBUG: Retrieving credentials from metadata server.
2016-07-07 08:28:13 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2016-07-07 08:28:13 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2016-07-07 08:28:13 [scrapy] INFO: Enabled item pipelines:
2016-07-07 08:28:13 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-07-07 08:28:13 [scrapy] INFO: Spider opened
2016-07-07 08:28:14 [scrapy] DEBUG: Crawled (200) <GET https://www.google.com/finance/historical?q=KRX%3AKOSPI200> (referer: None)
[s] Available Scrapy objects:
[s]   crawler    <scrapy.crawler.Crawler object at 0x7fcd973f0d90>
[s]   item       {}
[s]   request    <GET https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
[s]   response   <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
[s]   settings   <scrapy.settings.Settings object at 0x7fcd94f80b90>
[s]   spider     <DefaultSpider 'default' at 0x7fcd8d9b2f50>
[s] Useful shortcuts:
[s]   shelp()           Shell help (print this help)
[s]   fetch(req_or_url) Fetch request (or URL) and update local objects
[s]   view(response)    View response in a browser
2016-07-07 08:28:14 [root] DEBUG: Using default logger
2016-07-07 08:28:14 [root] DEBUG: Using default logger

In [1]:

```

This IPython console starts with the following objects already created. The web server's response, that is, the content of the web page, is stored in the `response` object.

* `crawler`
* `request`
* `response`



```
In [1]: type(response)
Out[1]: scrapy.http.response.html.HtmlResponse

In [2]: response.url
Out[2]: 'https://www.google.com/finance/historical?q=KRX%3AKOSPI200'

In [3]: response.body[:100]
Out[3]: '<!DOCTYPE html><html><head><script>(function(){(function(){function e(a){this.t={};this.tick=functio'

```

The `response` object, an instance of the `scrapy.http.response.html.HtmlResponse` class, provides methods such as `xpath` for parsing HTML. With these methods you can select the HTML elements you want.
 
 * http://doc.scrapy.org/en/latest/topics/request-response.html?#response-objects

```
In [4]: response.xpath('//td[@class="lm"]')
Out[4]:
[<Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jul 6, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jul 5, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jul 4, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jul 1, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 30, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 29, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 28, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 27, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 24, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 23, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 22, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 21, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 20, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 17, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 16, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 15, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 14, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 13, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 10, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 9, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 8, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 7, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 3, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 2, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">Jun 1, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">May 31, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">May 30, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">May 27, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">May 26, 2016\n</td>'>,
 <Selector xpath='//td[@class="lm"]' data=u'<td class="lm">May 25, 2016\n</td>'>]

```
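The selection step above can be tried without a network connection. The following is a minimal sketch, not Scrapy itself: it uses the standard library's `xml.etree.ElementTree` as a stand-in for `response.xpath` (ElementTree supports only a subset of XPath, while Scrapy selectors support full XPath 1.0). The `SAMPLE` markup and the `select_dates` helper are assumptions made up for illustration, modeled on the `<td class="lm">` cells shown in the shell output.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical fragment modeled on the cells seen in the shell session.
SAMPLE = """
<table>
  <tr><td class="lm">Jul 6, 2016
</td><td class="rgt">245.37</td></tr>
  <tr><td class="lm">Jul 5, 2016
</td><td class="rgt">247.67</td></tr>
</table>
"""


def select_dates(markup):
    """Select every <td class="lm"> cell and parse its text as a date."""
    root = ET.fromstring(markup)
    cells = root.findall(".//td[@class='lm']")
    # The cell text carries a trailing newline, so strip it before parsing.
    return [datetime.strptime(cell.text.strip(), "%b %d, %Y") for cell in cells]


dates = select_dates(SAMPLE)
print(dates)
```

In a real Scrapy shell session the equivalent selection would be `response.xpath('//td[@class="lm"]/text()')`, followed by `.extract()` to get the strings.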

## Creating a Scrapy project

Once you have finished investigating the target elements with the Scrapy shell, the next step is to implement the actual crawler. The first step is to create a project.

```
scrapy startproject tutorial
```

This creates a directory that contains the `tutorial` project, laid out as follows:

```
tutorial/
    scrapy.cfg            # deploy configuration file
    tutorial/             # project's Python module, you'll import your code from here
        __init__.py
        items.py          # project items file
        pipelines.py      # project pipelines file
        settings.py       # project settings file
        spiders/          # a directory where you'll later put your spiders
            __init__.py
```


## Implementing the Spider class

Under the `spiders` directory, implement the classes that actually read web pages and return data.


```python

# __init__.py

# Explicit relative import; the implicit form fails on Python 3.
from .dailystock import *

```

```python
# dailystock.py

import numpy as np
import scrapy
from dateutil.parser import parse as parse_date


class DailyStockSpider(scrapy.Spider):
    name = "dailystock"
    start_urls = ["https://www.google.com/finance/historical?q=KRX%3AKOSPI200"]

    def parse(self, response):
        # Date cells, for example "Jul 6, 2016".
        dates = [parse_date(x.extract().strip())
                 for x in response.xpath('//td[@class="lm"]/text()')]
        # Volume cells contain commas, which are removed before conversion.
        volumes = [int(x.extract().strip().replace(',', ''))
                   for x in response.xpath('//td[@class="rgt rm"]/text()')]
        # Price cells arrive as one flat sequence; regroup them into rows of
        # four values: open, high, low, close.
        prices = np.reshape(
            [float(x.extract().strip())
             for x in response.xpath('//td[@class="rgt"]/text()')],
            (-1, 4))
        for d, v, p in zip(dates, volumes, prices):
            # Convert NumPy scalars to plain Python numbers so that the
            # exporters and sqlite3 can handle them.
            yield {
                "symbol": "KOSPI",
                "date": d,
                "price_open": float(p[0]),
                "price_high": float(p[1]),
                "price_low": float(p[2]),
                "price_close": float(p[3]),
                "volume": int(v),
            }
```
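The spider relies on `np.reshape(..., (-1, 4))` to turn the flat list of price cells into one row per trading day. As a rough illustration of what that regrouping does, here is a pure-Python sketch; the `group_rows` helper is a hypothetical name introduced for this example, and the sample values are the first two days of prices from this tutorial's crawl output.

```python
def group_rows(values, width=4):
    """Split a flat list into consecutive tuples of `width` items."""
    if len(values) % width != 0:
        # np.reshape would also refuse a list that does not divide evenly.
        raise ValueError("number of values is not a multiple of %d" % width)
    return [tuple(values[i:i + width]) for i in range(0, len(values), width)]


# open, high, low, close for two consecutive days, flattened as on the page
flat = [243.15, 245.02, 242.63, 244.6, 245.37, 245.74, 240.73, 241.86]
rows = group_rows(flat)
for price_open, price_high, price_low, price_close in rows:
    print(price_open, price_high, price_low, price_close)
```

The length check matters in practice: if the page layout changes and a cell goes missing, failing loudly is safer than silently shifting every later price into the wrong column.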

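Once the spider is implemented, you can run a crawl as follows; the `-o` option writes the scraped items to `data.json`. Run this command from inside the project directory.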

```
scrapy crawl dailystock -o data.json
```
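The exported file can then be read back with the standard `json` module. The sketch below assumes that `data.json` is a JSON array of objects with the keys yielded by the spider, and that dates are serialized as `YYYY-MM-DD HH:MM:SS` strings; check the date format against your own export, since it is an assumption here. `SAMPLE_EXPORT` is a stand-in for the real file (with values taken from this tutorial's crawl output), and `load_records` is a hypothetical helper.

```python
import io
import json
from datetime import datetime

# Stand-in for open("data.json"); the exact serialization is assumed.
SAMPLE_EXPORT = io.StringIO("""[
{"symbol": "KOSPI", "date": "2016-07-07 00:00:00", "price_open": 243.15,
 "price_high": 245.02, "price_low": 242.63, "price_close": 244.6, "volume": 62210000},
{"symbol": "KOSPI", "date": "2016-07-06 00:00:00", "price_open": 245.37,
 "price_high": 245.74, "price_low": 240.73, "price_close": 241.86, "volume": 71672000}
]""")


def load_records(fp, date_format="%Y-%m-%d %H:%M:%S"):
    """Load exported items and turn the date strings back into datetimes."""
    records = json.load(fp)
    for record in records:
        record["date"] = datetime.strptime(record["date"], date_format)
    return records


records = load_records(SAMPLE_EXPORT)
latest = max(records, key=lambda r: r["date"])
print(latest["date"], latest["price_close"])
```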

## Implementing the Item class

In `items.py`, implement the class that represents a database record.

```python
import scrapy

class DailyStockItem(scrapy.Item):
    symbol = scrapy.Field()
    date = scrapy.Field()
    price_open = scrapy.Field()
    price_high = scrapy.Field()
    price_low = scrapy.Field()
    price_close = scrapy.Field()
    volume = scrapy.Field()
```

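## Implementing the Pipeline

A pipeline is used to put the collected data directly into a database rather than into a file. It is implemented in `pipelines.py`. Typically the database connection is created in the constructor, and records are inserted and committed in the `process_item` method.

This example uses an SQLite database.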


```python
import sqlite3


class DailyStockPipeline(object):
    filename = 'dailystock.sqlite'

    def __init__(self):
        # Open (or create) the database file and make sure the table exists.
        self.conn = sqlite3.connect(self.filename)
        self.conn.execute("""create table if not exists dailystock
            (symbol TEXT NOT NULL,
             date TIMESTAMP NOT NULL,
             price_open REAL,
             price_high REAL,
             price_low REAL,
             price_close REAL,
             volume INTEGER,
             PRIMARY KEY (symbol, date))""")
        self.conn.commit()

    def process_item(self, item, spider):
        # Called once for every item the spider yields.
        try:
            self.conn.execute('insert into dailystock values(?,?,?,?,?,?,?)',
                (item['symbol'], item['date'],
                 item['price_open'], item['price_high'],
                 item['price_low'], item['price_close'],
                 item['volume']))
            self.conn.commit()
        except sqlite3.Error as e:
            # For example, a repeated (symbol, date) violates the primary key.
            print(str(e))
        return item
```
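Because the table declares `PRIMARY KEY (symbol, date)`, running the crawl a second time tries to insert rows that already exist, and SQLite rejects them. The following sketch shows that behavior on an in-memory database, along with `INSERT OR REPLACE` as one possible alternative. The alternative is a suggestion for illustration, not something the pipeline above does, and the date is stored as a plain string here to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demonstration
conn.execute("""CREATE TABLE dailystock
    (symbol TEXT NOT NULL,
     date TIMESTAMP NOT NULL,
     price_open REAL, price_high REAL, price_low REAL, price_close REAL,
     volume INTEGER,
     PRIMARY KEY (symbol, date))""")

row = ("KOSPI", "2016-07-07 00:00:00", 243.15, 245.02, 242.63, 244.6, 62210000)
conn.execute("INSERT INTO dailystock VALUES (?,?,?,?,?,?,?)", row)

# A second plain INSERT of the same (symbol, date) violates the primary key.
try:
    conn.execute("INSERT INTO dailystock VALUES (?,?,?,?,?,?,?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# INSERT OR REPLACE overwrites the existing row instead of failing.
conn.execute("INSERT OR REPLACE INTO dailystock VALUES (?,?,?,?,?,?,?)", row)
count = conn.execute("SELECT COUNT(*) FROM dailystock").fetchone()[0]
print(duplicate_rejected, count)
```

Whether to skip duplicates or overwrite them depends on whether the source may revise past values; overwriting keeps the table in step with the latest page contents.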

## Settings

To use this pipeline, add the following settings to `settings.py`.

```python
BOT_NAME = 'tutorial'
SPIDER_MODULES = ['tutorial.spiders']
NEWSPIDER_MODULE = 'tutorial.spiders'
ITEM_PIPELINES = {
    'tutorial.pipelines.DailyStockPipeline': 300,
}
DOWNLOAD_HANDLERS = {
    's3': None,
}
```
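The integer assigned to each entry in `ITEM_PIPELINES` (300 here) controls ordering: according to the Scrapy documentation, items pass through pipelines from lower to higher values, customarily chosen in the 0-1000 range. The sketch below only illustrates that ordering rule with plain Python; it is not how Scrapy is implemented internally, and `ExamplePipeline` is a hypothetical second pipeline that does not exist in this project.

```python
ITEM_PIPELINES = {
    'tutorial.pipelines.DailyStockPipeline': 300,
    'tutorial.pipelines.ExamplePipeline': 100,  # hypothetical, for illustration
}

# Lower values run first, so sort the class paths by their assigned number.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)
for path in run_order:
    print(ITEM_PIPELINES[path], path)
```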









## Crawling

To actually run the crawl, issue the following command from the `tutorial` project directory.

```
$ scrapy crawl dailystock
2016-07-08 03:40:40 [scrapy] INFO: Scrapy 1.0.3 started (bot: tutorial)
2016-07-08 03:40:40 [scrapy] INFO: Optional features available: ssl, http11, boto
2016-07-08 03:40:40 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2016-07-08 03:40:40 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2016-07-08 03:40:40 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2016-07-08 03:40:40 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2016-07-08 03:40:40 [scrapy] INFO: Enabled item pipelines: DailyStockPipeline
2016-07-08 03:40:40 [scrapy] INFO: Spider opened
2016-07-08 03:40:40 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-07-08 03:40:40 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-07-08 03:40:40 [scrapy] DEBUG: Crawled (200) <GET https://www.google.com/finance/historical?q=KRX%3AKOSPI200> (referer: None)
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 244.59999999999999, 'volume': 62210000, 'price_open': 243.15000000000001, 'price_low': 242.63, 'date': datetime.datetime(2016, 7, 7, 0, 0), 'price_high': 245.02000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 241.86000000000001, 'volume': 71672000, 'price_open': 245.37, 'price_low': 240.72999999999999, 'date': datetime.datetime(2016, 7, 6, 0, 0), 'price_high': 245.74000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 246.91, 'volume': 54810000, 'price_open': 247.66999999999999, 'price_low': 246.50999999999999, 'date': datetime.datetime(2016, 7, 5, 0, 0), 'price_high': 247.84}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 247.62, 'volume': 63633000, 'price_open': 246.66, 'price_low': 246.16, 'date': datetime.datetime(2016, 7, 4, 0, 0), 'price_high': 247.94}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 246.52000000000001, 'volume': 63045000, 'price_open': 244.96000000000001, 'price_low': 244.78, 'date': datetime.datetime(2016, 7, 1, 0, 0), 'price_high': 247.56999999999999}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 244.13999999999999, 'volume': 75453000, 'price_open': 244.25, 'price_low': 242.52000000000001, 'date': datetime.datetime(2016, 6, 30, 0, 0), 'price_high': 244.47999999999999}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 242.31999999999999, 'volume': 67355000, 'price_open': 241.28999999999999, 'price_low': 240.68000000000001, 'date': datetime.datetime(2016, 6, 29, 0, 0), 'price_high': 243.71000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 240.08000000000001, 'volume': 79648000, 'price_open': 236.78, 'price_low': 236.72999999999999, 'date': datetime.datetime(2016, 6, 28, 0, 0), 'price_high': 240.46000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 239.28, 'volume': 95450000, 'price_open': 236.78999999999999, 'price_low': 236.68000000000001, 'date': datetime.datetime(2016, 6, 27, 0, 0), 'price_high': 239.28}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 239.21000000000001, 'volume': 190032000, 'price_open': 248.22999999999999, 'price_low': 234.97, 'date': datetime.datetime(2016, 6, 24, 0, 0), 'price_high': 248.27000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 246.31, 'volume': 65983000, 'price_open': 246.41999999999999, 'price_low': 245.63999999999999, 'date': datetime.datetime(2016, 6, 23, 0, 0), 'price_high': 246.81}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 246.75, 'volume': 65848000, 'price_open': 245.16999999999999, 'price_low': 244.78999999999999, 'date': datetime.datetime(2016, 6, 22, 0, 0), 'price_high': 247.03}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 245.34, 'volume': 58402000, 'price_open': 244.59999999999999, 'price_low': 243.99000000000001, 'date': datetime.datetime(2016, 6, 21, 0, 0), 'price_high': 245.5}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 245.16999999999999, 'volume': 75085000, 'price_open': 244.44, 'price_low': 243.74000000000001, 'date': datetime.datetime(2016, 6, 20, 0, 0), 'price_high': 245.63}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 241.63, 'volume': 77334000, 'price_open': 243.34999999999999, 'price_low': 241.31999999999999, 'date': datetime.datetime(2016, 6, 17, 0, 0), 'price_high': 244.02000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 241.61000000000001, 'volume': 83308000, 'price_open': 243.66, 'price_low': 240.52000000000001, 'date': datetime.datetime(2016, 6, 16, 0, 0), 'price_high': 244.0}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 243.30000000000001, 'volume': 84095000, 'price_open': 243.41, 'price_low': 242.19, 'date': datetime.datetime(2016, 6, 15, 0, 0), 'price_high': 244.16}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 243.34999999999999, 'volume': 107840000, 'price_open': 243.78999999999999, 'price_low': 242.36000000000001, 'date': datetime.datetime(2016, 6, 14, 0, 0), 'price_high': 244.46000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 244.05000000000001, 'volume': 79114000, 'price_open': 246.74000000000001, 'price_low': 243.69999999999999, 'date': datetime.datetime(2016, 6, 13, 0, 0), 'price_high': 246.96000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 248.96000000000001, 'volume': 89452000, 'price_open': 249.86000000000001, 'price_low': 248.63, 'date': datetime.datetime(2016, 6, 10, 0, 0), 'price_high': 249.86000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 250.19, 'volume': 138770000, 'price_open': 250.19, 'price_low': 248.46000000000001, 'date': datetime.datetime(2016, 6, 9, 0, 0), 'price_high': 251.50999999999999}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 250.03999999999999, 'volume': 90733000, 'price_open': 248.24000000000001, 'price_low': 247.68000000000001, 'date': datetime.datetime(2016, 6, 8, 0, 0), 'price_high': 250.03999999999999}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 247.84999999999999, 'volume': 79435000, 'price_open': 245.55000000000001, 'price_low': 245.53999999999999, 'date': datetime.datetime(2016, 6, 7, 0, 0), 'price_high': 247.86000000000001}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 244.38999999999999, 'volume': 86345000, 'price_open': 244.94, 'price_low': 243.66999999999999, 'date': datetime.datetime(2016, 6, 3, 0, 0), 'price_high': 244.94}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 244.18000000000001, 'volume': 78953000, 'price_open': 243.88, 'price_low': 243.31, 'date': datetime.datetime(2016, 6, 2, 0, 0), 'price_high': 244.66}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 243.58000000000001, 'volume': 78835000, 'price_open': 242.69999999999999, 'price_low': 242.47999999999999, 'date': datetime.datetime(2016, 6, 1, 0, 0), 'price_high': 244.19}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 243.63, 'volume': 117966000, 'price_open': 241.03, 'price_low': 240.36000000000001, 'date': datetime.datetime(2016, 5, 31, 0, 0), 'price_high': 243.84}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 241.72999999999999, 'volume': 67446000, 'price_open': 241.94, 'price_low': 240.41, 'date': datetime.datetime(2016, 5, 30, 0, 0), 'price_high': 242.06999999999999}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 241.84999999999999, 'volume': 92225000, 'price_open': 241.25, 'price_low': 240.74000000000001, 'date': datetime.datetime(2016, 5, 27, 0, 0), 'price_high': 242.41}
2016-07-08 03:40:40 [scrapy] DEBUG: Scraped from <200 https://www.google.com/finance/historical?q=KRX%3AKOSPI200>
{'symbol': 'KOSPI', 'price_close': 240.58000000000001, 'volume': 130282000, 'price_open': 241.56, 'price_low': 240.44999999999999, 'date': datetime.datetime(2016, 5, 26, 0, 0), 'price_high': 242.05000000000001}
2016-07-08 03:40:40 [scrapy] INFO: Closing spider (finished)
2016-07-08 03:40:40 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 247,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 8127,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2016, 7, 8, 3, 40, 40, 723470),
 'item_scraped_count': 30,
 'log_count/DEBUG': 32,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2016, 7, 8, 3, 40, 40, 271766)}
2016-07-08 03:40:40 [scrapy] INFO: Spider closed (finished)
```

When the crawl is complete, you can check the contents of the SQLite database as follows.

```
$ sqlite3 dailystock.sqlite 'select * from dailystock'
KOSPI|2016-07-07 00:00:00|243.15|245.02|242.63|244.6|62210000
KOSPI|2016-07-06 00:00:00|245.37|245.74|240.73|241.86|71672000
KOSPI|2016-07-05 00:00:00|247.67|247.84|246.51|246.91|54810000
KOSPI|2016-07-04 00:00:00|246.66|247.94|246.16|247.62|63633000
KOSPI|2016-07-01 00:00:00|244.96|247.57|244.78|246.52|63045000
KOSPI|2016-06-30 00:00:00|244.25|244.48|242.52|244.14|75453000
KOSPI|2016-06-29 00:00:00|241.29|243.71|240.68|242.32|67355000
KOSPI|2016-06-28 00:00:00|236.78|240.46|236.73|240.08|79648000
KOSPI|2016-06-27 00:00:00|236.79|239.28|236.68|239.28|95450000
KOSPI|2016-06-24 00:00:00|248.23|248.27|234.97|239.21|190032000
KOSPI|2016-06-23 00:00:00|246.42|246.81|245.64|246.31|65983000
KOSPI|2016-06-22 00:00:00|245.17|247.03|244.79|246.75|65848000
KOSPI|2016-06-21 00:00:00|244.6|245.5|243.99|245.34|58402000
KOSPI|2016-06-20 00:00:00|244.44|245.63|243.74|245.17|75085000
KOSPI|2016-06-17 00:00:00|243.35|244.02|241.32|241.63|77334000
KOSPI|2016-06-16 00:00:00|243.66|244.0|240.52|241.61|83308000
KOSPI|2016-06-15 00:00:00|243.41|244.16|242.19|243.3|84095000
KOSPI|2016-06-14 00:00:00|243.79|244.46|242.36|243.35|107840000
KOSPI|2016-06-13 00:00:00|246.74|246.96|243.7|244.05|79114000
KOSPI|2016-06-10 00:00:00|249.86|249.86|248.63|248.96|89452000
KOSPI|2016-06-09 00:00:00|250.19|251.51|248.46|250.19|138770000
KOSPI|2016-06-08 00:00:00|248.24|250.04|247.68|250.04|90733000
KOSPI|2016-06-07 00:00:00|245.55|247.86|245.54|247.85|79435000
KOSPI|2016-06-03 00:00:00|244.94|244.94|243.67|244.39|86345000
KOSPI|2016-06-02 00:00:00|243.88|244.66|243.31|244.18|78953000
KOSPI|2016-06-01 00:00:00|242.7|244.19|242.48|243.58|78835000
KOSPI|2016-05-31 00:00:00|241.03|243.84|240.36|243.63|117966000
KOSPI|2016-05-30 00:00:00|241.94|242.07|240.41|241.73|67446000
KOSPI|2016-05-27 00:00:00|241.25|242.41|240.74|241.85|92225000
KOSPI|2016-05-26 00:00:00|241.56|242.05|240.45|240.58|130282000
```
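The same table can also be queried from Python with the standard `sqlite3` module. The sketch below uses an in-memory database as a stand-in for `dailystock.sqlite`, loaded with three rows copied from the output above, so it runs on its own; against the real file you would pass `'dailystock.sqlite'` to `sqlite3.connect` and skip the setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for dailystock.sqlite
conn.row_factory = sqlite3.Row      # allows access to columns by name
conn.execute("""CREATE TABLE dailystock
    (symbol TEXT NOT NULL,
     date TIMESTAMP NOT NULL,
     price_open REAL, price_high REAL, price_low REAL, price_close REAL,
     volume INTEGER,
     PRIMARY KEY (symbol, date))""")
conn.executemany("INSERT INTO dailystock VALUES (?,?,?,?,?,?,?)", [
    ("KOSPI", "2016-06-27 00:00:00", 236.79, 239.28, 236.68, 239.28, 95450000),
    ("KOSPI", "2016-06-24 00:00:00", 248.23, 248.27, 234.97, 239.21, 190032000),
    ("KOSPI", "2016-06-23 00:00:00", 246.42, 246.81, 245.64, 246.31, 65983000),
])

# Day with the highest trading volume.
busiest = conn.execute(
    "SELECT date, volume FROM dailystock ORDER BY volume DESC LIMIT 1"
).fetchone()

# Closing prices in chronological order.
closes = [r["price_close"] for r in conn.execute(
    "SELECT price_close FROM dailystock WHERE symbol = ? ORDER BY date",
    ("KOSPI",))]
print(busiest["date"], busiest["volume"], closes)
```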