* Validate MongoDB Database Setup
* Overview of Scrapy Pipelines
* Overview of using hash code for quote text
* Update Spider Logic to include hash code
* Develop Pipeline Logic to write to MongoDB
* Run the Pipeline to write to MongoDB
* Validate Data in MongoDB Collection
* Exercise and Solution

* Validate MongoDB Database Setup

1. Make sure MongoDB is running and listening on its default port. One way to check is `telnet localhost 27017`.
2. Launch the Mongo shell using `mongosh`.
3. We can also use `pymongo` to connect to the MongoDB database from Python.

```python
import pymongo
client = pymongo.MongoClient('localhost', 27017)

for db in client.list_databases():
    print(db['name'])

# Accessing a database by name does not create it right away;
# MongoDB creates it when data is first written to it
db = client['quotes_db']

# If the database has no data yet, there are no collections and this loop prints nothing
for collection in db.list_collections():
    print(collection)
```

* Overview of Scrapy Pipelines
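
A Scrapy item pipeline is a plain Python class that Scrapy calls for every item a spider yields. Its `process_item(self, item, spider)` method returns the item so that later stages receive it, and the optional `open_spider` and `close_spider` hooks run once at the start and end of the crawl. Below is a minimal sketch, assuming items are dictionaries with the `quoteText` field used by the spider in these notes. The class name `StripQuoteTextPipeline` is hypothetical.

```python
class StripQuoteTextPipeline:
    """Hypothetical pipeline that trims whitespace around the quote text."""

    def open_spider(self, spider):
        # Called once when the spider starts; set up resources here
        self.items_seen = 0

    def close_spider(self, spider):
        # Called once when the spider finishes; release resources here
        print(f'Processed {self.items_seen} items')

    def process_item(self, item, spider):
        # Called for every item the spider yields
        self.items_seen += 1
        if item.get('quoteText'):
            item['quoteText'] = item['quoteText'].strip()
        # Return the item so that later pipelines (if any) receive it
        return item
```

Pipelines are enabled through the `ITEM_PIPELINES` setting, which maps the class path to an integer order (lower values run first), for example `{'myproject.pipelines.StripQuoteTextPipeline': 300}`. The module path `myproject.pipelines` is an assumption.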

* Overview of using hash code for quote text

```python
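import hashlib

# Create a new SHA-256 hash object
sha = hashlib.sha256()

s = 'Hello World'
# hashlib works on bytes, so encode the string first
sha.update(s.encode())

# Hex string of the digest; the same text always produces the same value
sha.hexdigest()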
```
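
One detail matters when hashing many strings: `update()` appends to whatever the hash object has already been fed, so each text needs its own hash object. The sketch below illustrates this and wraps the per-text hashing in a small helper. The name `quote_hash` is hypothetical.

```python
import hashlib

# Feeding two chunks into one hash object ...
sha = hashlib.sha256()
sha.update(b'Hello ')
sha.update(b'World')

# ... gives the same digest as hashing the concatenated bytes in one call
print(sha.hexdigest() == hashlib.sha256(b'Hello World').hexdigest())


def quote_hash(text):
    """Return the SHA-256 hex digest of a single string (hypothetical helper)."""
    # A fresh hash object per call keeps each digest independent of earlier texts
    return hashlib.sha256(text.encode()).hexdigest()


# Same text gives the same hash; different text gives a different hash
print(quote_hash('Hello World') == quote_hash('Hello World'))
print(quote_hash('Hello World') == quote_hash('Hello World!'))
```

Because the digest depends only on the text, it can serve as a stable key for recognizing a quote that has already been stored.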

* Update Spider Logic to include hash code

```python
import hashlib
import scrapy


def generate_urls(base_url):
    """Build the URLs for pages 1 through 100 of the listing."""
    return [f'{base_url}?page={i}' for i in range(1, 101)]

    
class QuoteSpider(scrapy.Spider):
    name = 'quotes'
    start_urls = generate_urls('https://www.goodreads.com/quotes')

    def parse(self, response):
        for quoteDetails in response.css('.quoteDetails'):
            quote_text = quoteDetails.css('.quoteText::text').get()
            if quote_text is None:
                # Nothing to hash if the block has no quote text
                continue
            # Hash each quote with its own hash object; a shared object would
            # accumulate all earlier quotes, because update() appends to prior input
            quote_text_hash = hashlib.sha256(quote_text.encode()).hexdigest()
            payload = {
                'quoteTextHash': quote_text_hash,
                'quoteText': quote_text,
                'authorOrTitle': quoteDetails.css('span.authorOrTitle::text').get(),
                'authorOrTitleUrl': quoteDetails.css('a.authorOrTitle::attr(href)').get(),
                'authorOrTitleText': quoteDetails.css('a.authorOrTitle::text').get()
            }
            yield payload
```

* Develop Pipeline Logic to write to MongoDB

```python

```
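
A minimal sketch of what such a pipeline could look like, assuming MongoDB runs on `localhost:27017` and the items carry the `quoteTextHash` field produced by the spider. The class name `MongoQuotesPipeline`, the collection name `quotes`, and the choice to upsert on the hash are assumptions, not something these notes prescribe.

```python
class MongoQuotesPipeline:
    """Hypothetical pipeline that upserts each quote into MongoDB by its text hash."""

    def __init__(self, mongo_uri='mongodb://localhost:27017',
                 db_name='quotes_db', collection_name='quotes'):
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Imported here so the class can be defined even where pymongo is not installed
        import pymongo
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Close the connection if one was opened
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        doc = dict(item)
        # Upsert keyed on the hash: re-running the spider updates the
        # existing document instead of inserting a duplicate
        self.collection.update_one(
            {'quoteTextHash': doc['quoteTextHash']},
            {'$set': doc},
            upsert=True,
        )
        return item
```

To enable it for a single spider, one option is a `custom_settings` attribute on `QuoteSpider`, such as `custom_settings = {'ITEM_PIPELINES': {'pipelines.MongoQuotesPipeline': 300}}`, where the module path `pipelines` is an assumption about where the class is saved.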