<a href="https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/project_gutenberg.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Gutenberg Retriever

In this tutorial, we will demonstrate how to use the `ProjectGutenbergRetriever` to retrieve books from Project Gutenberg. We will cover the following steps:

1. Retrieving the book "Crime and Punishment".
2. Attempting to retrieve a non-existent book "NO_BOOK_EXIST_WITH_THIS_NAME".
3. Retrieving books asynchronously.


## Installation and Setup

First, install the required packages: `langchain-community`, which provides the `langchain_community` module imported below, and `requests`.

In [1]:
!pip install requests langchain-community



Next, import the `ProjectGutenbergRetriever` class. This tutorial covers usage only and does not go into the implementation of the class.

In [6]:
from langchain_community.retrievers import ProjectGutenbergRetriever

In [None]:
# Initialize the retriever
retriever = ProjectGutenbergRetriever()

## 1. Retrieving the Book "Crime and Punishment"

Pass the title to `invoke` as the query. Each returned document exposes its title under `metadata["title"]`.

In [7]:
# Retrieve the book by title, using the retriever initialized above
books = retriever.invoke("Crime and Punishment")
for book in books:
    print(book.metadata["title"])

Crime and Punishment


## 2. Attempting to Retrieve a Non-existent Book

Let's see what happens when we attempt to retrieve a non-existent book "NO_BOOK_EXIST_WITH_THIS_NAME".

In [10]:
non_existent_book = retriever.invoke("NO_BOOK_EXIST_WITH_THIS_NAME")
non_existent_book

[]
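
Because a query with no matches returns an empty list rather than raising an error, calling code can simply branch on emptiness. The following is a minimal sketch, not part of the retriever itself: `titles_or_message` is a hypothetical helper, and `SimpleNamespace` objects stand in for retrieved documents so the example runs without network access. It assumes only that each document carries a `metadata` dict with a `"title"` key, as in the cells above.

```python
from types import SimpleNamespace


def titles_or_message(docs, query):
    """Return the titles of retrieved documents, or a fallback message.

    Hypothetical helper: assumes each document exposes a `metadata` dict
    with a "title" key, as in the cells above.
    """
    if not docs:  # an empty list means no book matched the query
        return [f"No books found for {query!r}"]
    return [doc.metadata["title"] for doc in docs]


# Stand-in objects so the sketch runs offline; with the real retriever,
# pass the result of retriever.invoke(query) instead.
found = [SimpleNamespace(metadata={"title": "Crime and Punishment"})]
print(titles_or_message(found, "Crime and Punishment"))
print(titles_or_message([], "NO_BOOK_EXIST_WITH_THIS_NAME"))
```

With the real retriever, the same check would apply to the list returned by `retriever.invoke(...)`.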

## 3. Retrieving Books Asynchronously

Finally, let's retrieve books asynchronously. The asynchronous functionality works similarly to the synchronous one, providing the same results but in a non-blocking manner.

In [None]:
import asyncio


# Define an asynchronous function to retrieve books
async def retrieve_books_async(query):
    retriever = ProjectGutenbergRetriever()
    return await retriever.ainvoke(query)


# Retrieve books asynchronously
async def main():
    books_async = await retrieve_books_async("Pride and Prejudice")
    for book in books_async:
        print(book.metadata["title"])


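# Run the asynchronous retrieval. Top-level `await` works in a notebook;
# in a plain Python script, use `asyncio.run(main())` instead.
await main()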
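
Non-blocking retrieval pays off mainly when several queries run at the same time. The sketch below illustrates that pattern with `asyncio.gather`. It is an illustration under stated assumptions: `FakeRetriever` is a hypothetical stand-in that mimics only the `ainvoke(query)` call shape used above, and `retrieve_many` is a hypothetical helper, so the example runs offline.

```python
import asyncio
from types import SimpleNamespace


class FakeRetriever:
    """Hypothetical stand-in mimicking the `ainvoke(query)` call shape."""

    async def ainvoke(self, query):
        await asyncio.sleep(0.01)  # simulate network latency
        return [SimpleNamespace(metadata={"title": query})]


async def retrieve_many(retriever, queries):
    # Schedule all lookups at once; gather returns results in input order.
    results = await asyncio.gather(*(retriever.ainvoke(q) for q in queries))
    return dict(zip(queries, results))


async def demo():
    queries = ["Pride and Prejudice", "Crime and Punishment"]
    return await retrieve_many(FakeRetriever(), queries)


# In a script, asyncio.run starts the event loop; in a notebook,
# use `results = await demo()` instead.
results = asyncio.run(demo())
for query, docs in results.items():
    print(query, "->", [doc.metadata["title"] for doc in docs])
```

Passing a `ProjectGutenbergRetriever` instance in place of `FakeRetriever()` should work the same way, provided its `ainvoke` behaves as shown in the cell above.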

This tutorial demonstrated how to use the `ProjectGutenbergRetriever` to retrieve books from Project Gutenberg, including retrieving specific books, handling non-existent books, and using asynchronous retrieval.