### Method 1
1. Before saving data to a MySQL database, install a MySQL driver:
    * pip install -i https://pypi.douban.com/simple mysqlclient
    * On Linux, install the development headers first:
        * (Ubuntu): sudo apt-get install libmysqlclient-dev
        * (CentOS): sudo yum install python-devel mysql-devel
2. Write a pipeline for MySQL in pipelines.py:
```
import MySQLdb
class MysqlPipeline(object):
    # Writes to MySQL synchronously
    def __init__(self):
        self.conn = MySQLdb.connect('localhost', 'root', '19980102', 'article_spider', charset="utf8", use_unicode=True)
        self.cursor = self.conn.cursor()

    def process_item(self, item, spider):
        insert_sql = """
            insert into article(title, url, create_date, fav_nums, url_object_id)
            VALUES (%s, %s, %s, %s, %s)
        """
        self.cursor.execute(insert_sql, (item["title"], item["url"], item["create_date"], item["fav_nums"], item["url_object_id"]))
        self.conn.commit()

        return item
```
3. Enable the pipeline in settings.py:
```
ITEM_PIPELINES = {
    'ArticleSpider.pipelines.MysqlPipeline': 3
}
```
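
The pipeline above opens its connection in `__init__` and never closes it. As a minimal sketch of the same synchronous pattern with an explicit lifecycle, the example below uses Scrapy's `open_spider` and `close_spider` pipeline hooks. It swaps in the standard-library `sqlite3` module for `MySQLdb` so that it runs without a MySQL server; the class name `SqlitePipeline`, the table schema, and the sample item are assumptions made for illustration.

```python
import sqlite3


class SqlitePipeline:
    """Same shape as MysqlPipeline, with sqlite3 standing in for MySQLdb."""

    def open_spider(self, spider):
        # Scrapy calls this hook once when the spider starts.
        self.conn = sqlite3.connect(":memory:")
        self.cursor = self.conn.cursor()
        # Hypothetical schema; the column types are assumptions.
        self.cursor.execute(
            "CREATE TABLE article("
            "title TEXT, url TEXT, create_date TEXT, "
            "fav_nums INTEGER, url_object_id TEXT PRIMARY KEY)"
        )

    def process_item(self, item, spider):
        # sqlite3 uses "?" placeholders where MySQLdb uses "%s".
        self.cursor.execute(
            "INSERT INTO article(title, url, create_date, fav_nums, url_object_id) "
            "VALUES (?, ?, ?, ?, ?)",
            (item["title"], item["url"], item["create_date"],
             item["fav_nums"], item["url_object_id"]),
        )
        self.conn.commit()  # synchronous: blocks until the row is written
        return item

    def close_spider(self, spider):
        # Scrapy calls this hook once when the spider closes.
        self.cursor.close()
        self.conn.close()


# Drive the pipeline by hand, the way Scrapy would during a crawl.
pipeline = SqlitePipeline()
pipeline.open_spider(None)
item = {
    "title": "hello",
    "url": "http://example.com/1",
    "create_date": "2018-01-01",
    "fav_nums": 3,
    "url_object_id": "abc123",
}
returned = pipeline.process_item(item, None)
count = pipeline.cursor.execute("SELECT COUNT(*) FROM article").fetchone()[0]
pipeline.close_spider(None)
```

With `MySQLdb` the structure would stay the same; only the `connect()` arguments and the placeholder style change.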

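### Method 2 (recommended)
1. Scrapy usually parses pages faster than the database can store them. As the crawl produces more and more items, inserts fall further behind parsing, and the pipeline blocks.
2. Twisted provides a way to make MySQL inserts asynchronous (a connection pool).
    * Synchronous call: self.cursor.execute(insert_sql, (item["title"], item["url"], item["create_date"], item["fav_nums"], item["url_object_id"]))
    * Synchronous means that nothing else can run until this call has finished.
3. The pipeline class above also hard-codes the MySQL configuration. Move it into settings.py instead: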
```
MYSQL_HOST = "localhost"
MYSQL_DBNAME = "article_spider"
MYSQL_USER = "root"
MYSQL_PASSWORD = "19980102"
```
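
If the credentials should not be committed with the project either, one possible variation is to read them from environment variables in settings.py. This is only a sketch: the environment variable names are assumptions, and the defaults mirror the values above except for the password, which defaults to an empty string.

```python
import os

# Each value falls back to a default when the environment variable is unset.
MYSQL_HOST = os.environ.get("MYSQL_HOST", "localhost")
MYSQL_DBNAME = os.environ.get("MYSQL_DBNAME", "article_spider")
MYSQL_USER = os.environ.get("MYSQL_USER", "root")
MYSQL_PASSWORD = os.environ.get("MYSQL_PASSWORD", "")
```

The pipeline reads these names through `settings[...]` in the same way, whichever way they are defined.
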
4. Write the pipeline class:
```
from twisted.enterprise import adbapi
import MySQLdb.cursors
class MysqlTwistedPipline(object):
    def __init__(self, dbpool):
        self.dbpool = dbpool

    @classmethod
    def from_settings(cls, settings):
        dbparms = dict(
            host=settings["MYSQL_HOST"],
            db=settings["MYSQL_DBNAME"],
            user=settings["MYSQL_USER"],
            passwd=settings["MYSQL_PASSWORD"],
            charset='utf8',
            cursorclass=MySQLdb.cursors.DictCursor,
            use_unicode=True,
        )
        dbpool = adbapi.ConnectionPool("MySQLdb", **dbparms)

        return cls(dbpool)

    def process_item(self, item, spider):
        # Use Twisted to run the MySQL insert asynchronously
        query = self.dbpool.runInteraction(self.do_insert, item)
        query.addErrback(self.handle_error, item, spider)  # handle exceptions
        # Return the item so that later pipelines still receive it
        return item

    def handle_error(self, failure, item, spider):
        # Handle exceptions raised by the asynchronous insert
        print(failure)

    def do_insert(self, cursor, item):
        # Perform the actual insert.
        # Build the SQL statement for the given item and insert it into MySQL.
        insert_sql = """
            insert into article(title, url, create_date, fav_nums, url_object_id)
            VALUES (%s, %s, %s, %s, %s)
        """
        cursor.execute(insert_sql,
                       (item["title"], item["url"], item["create_date"], item["fav_nums"], item["url_object_id"]))
```
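    * def from_settings(cls, settings)
    ```
    @classmethod
    def from_settings(cls, settings):
    ```
        * from_settings(): the method name is fixed. For custom components and extensions, Scrapy calls this method when it creates the object and passes in settings (the values from settings.py) so that they can be read.
    * dbpool = adbapi.ConnectionPool("MySQLdb", **dbparms)
        * Makes the MySQLdb operations asynchronous. The first argument is the name of the DB-API module; the remaining keyword arguments are the connection parameters (here unpacked from a dict).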
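
To make the two arguments of `adbapi.ConnectionPool` and the behaviour of `runInteraction` more concrete, here is a rough, simplified sketch of the idea using only the standard library. It is not Twisted's implementation: `ToyConnectionPool` and `run_interaction` are hypothetical names, a `ThreadPoolExecutor` stands in for Twisted's thread pool, a `Future` stands in for the Deferred, and `sqlite3` with a temporary file stands in for `MySQLdb`. The sketch also opens a fresh connection per call for simplicity, whereas a real pool reuses connections.

```python
import importlib
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


class ToyConnectionPool:
    """Hypothetical, simplified stand-in for adbapi.ConnectionPool."""

    def __init__(self, dbapi_name, **connect_kwargs):
        # First argument: the DB-API module name, imported by name.
        self.dbapi = importlib.import_module(dbapi_name)
        # Remaining keyword arguments are passed on to dbapi.connect().
        self.connect_kwargs = connect_kwargs
        self.executor = ThreadPoolExecutor(max_workers=2)

    def run_interaction(self, func, *args):
        # Schedule func(cursor, *args) on a worker thread and return at once.
        return self.executor.submit(self._run, func, args)

    def _run(self, func, args):
        conn = self.dbapi.connect(**self.connect_kwargs)
        try:
            cursor = conn.cursor()
            result = func(cursor, *args)
            conn.commit()  # commit only if func succeeded
            return result
        except Exception:
            conn.rollback()  # undo partial work, then let the error surface
            raise
        finally:
            conn.close()

    def close(self):
        self.executor.shutdown(wait=True)


def do_insert(cursor, item):
    # Same role as do_insert in the pipeline; sqlite3 uses "?" placeholders.
    cursor.execute(
        "INSERT INTO article(title, url) VALUES (?, ?)",
        (item["title"], item["url"]),
    )


db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
dbparms = dict(database=db_path)
pool = ToyConnectionPool("sqlite3", **dbparms)

pool.run_interaction(
    lambda cur: cur.execute("CREATE TABLE article(title TEXT, url TEXT)")
).result()
future = pool.run_interaction(do_insert, {"title": "hello", "url": "http://example.com/1"})
future.result()  # wait here only so the demo can read the row back
rows = pool.run_interaction(
    lambda cur: cur.execute("SELECT title, url FROM article").fetchall()
).result()
pool.close()
```

The caller of `run_interaction` gets control back immediately, which is the property the pipeline relies on: `process_item` can return while the insert is still running on another thread.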