# Google Drive tool

This notebook walks through connecting LangChain to the Google Drive API.

## Prerequisites

1. Create a Google Cloud project or use an existing project
1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)
1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)
1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`

## Instructions for retrieving your Google Docs data
By default, `GoogleDriveSearchTool` and `GoogleDriveAPIWrapper` expect the `credentials.json` file at `~/.credentials/credentials.json`; you can change this location with the `GOOGLE_ACCOUNT_FILE` environment variable.
`token.json` is stored in the same directory unless you set the `token_path` parameter. It is created automatically the first time you use the tool.
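As a minimal sketch of the configuration described above, the snippet below sets `GOOGLE_ACCOUNT_FILE` from Python and derives the matching `token.json` location. The path is the default named in the text. When the variable is read is not stated here, so setting it before the tool is created is an assumption.

```python
import os
from pathlib import Path

# Default location named in the text; point this elsewhere if your
# credentials file lives in another directory.
credentials_path = Path.home() / ".credentials" / "credentials.json"

# Assumption: the variable should be set before the wrapper is created.
os.environ["GOOGLE_ACCOUNT_FILE"] = str(credentials_path)

# token.json sits next to credentials.json unless token_path overrides it.
token_path = credentials_path.with_name("token.json")
print(token_path)
```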

`GoogleDriveSearchTool` retrieves a selection of files that match a query.

By default, if you set a `folder_id`, every file in that folder whose name matches the query can be retrieved as a `Document`.


In [None]:
#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib

You can obtain your folder and document id from the URL:
* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"`
* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw"`
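If you would rather not copy the id by hand, a small helper can pull it out of either URL shape shown above. This is only a sketch: `extract_drive_id` is a hypothetical name, and the pattern assumes the id follows `/folders/` or `/d/` as in the two examples.

```python
import re
from typing import Optional

# The id is the path segment right after "/folders/" or "/d/".
_ID_PATTERN = re.compile(r"/(?:folders|d)/([A-Za-z0-9_-]+)")


def extract_drive_id(url: str) -> Optional[str]:
    """Return the folder or document id found in a Drive URL, or None."""
    match = _ID_PATTERN.search(url)
    return match.group(1) if match else None


folder_url = "https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"
doc_url = "https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit"

print(extract_drive_id(folder_url))
print(extract_drive_id(doc_url))
```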

The special value `root` refers to the top level of your personal Drive.

In [1]:
folder_id = "root"
# folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'

By default, files with the following MIME types can be converted to `Document`:
- text/text
- text/plain
- text/html
- text/csv
- text/markdown
- image/png
- image/jpeg
- application/epub+zip
- application/pdf
- application/rtf
- application/vnd.google-apps.document (GDoc)
- application/vnd.google-apps.presentation (GSlide)
- application/vnd.google-apps.spreadsheet (GSheet)
- application/vnd.google.colaboratory (Notebook colab)
- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)
- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)
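To check in advance whether a file is likely to be converted, the list above can be captured as a set. This is an illustration only: `DEFAULT_MIME_TYPES` and `is_convertible` are hypothetical names, and the library's own lookup may work differently.

```python
# MIME types copied from the list above.
DEFAULT_MIME_TYPES = frozenset(
    {
        "text/text",
        "text/plain",
        "text/html",
        "text/csv",
        "text/markdown",
        "image/png",
        "image/jpeg",
        "application/epub+zip",
        "application/pdf",
        "application/rtf",
        "application/vnd.google-apps.document",
        "application/vnd.google-apps.presentation",
        "application/vnd.google-apps.spreadsheet",
        "application/vnd.google.colaboratory",
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    }
)


def is_convertible(mime_type: str) -> bool:
    """Return True if the MIME type appears in the default list."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    return mime_type.split(";")[0].strip().lower() in DEFAULT_MIME_TYPES


print(is_convertible("application/pdf"))
print(is_convertible("video/mp4"))
```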

This list can be updated or customized; see the documentation of `GoogleDriveAPIWrapper`.

The packages needed for each format must be installed, however.

In [2]:
#!pip install unstructured

In [3]:
from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool
from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper

# By default, the search matches file names only.
tool = GoogleDriveSearchTool(
    api_wrapper=GoogleDriveAPIWrapper(
        folder_id=folder_id,
        num_results=2,
        template="gdrive-query-in-folder",  # Search in the body of documents
    )
)
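The comments in the cell above distinguish searching file names from searching document bodies. In the Drive API's own query language (the `q` parameter of `files.list`), that distinction corresponds to `name contains` versus `fullText contains`. The sketch below builds such a query string by hand. `build_drive_query` is a hypothetical helper, and the idea that the `gdrive-query-in-folder` template expands to something similar is an assumption, not something this notebook confirms.

```python
def build_drive_query(text: str, folder_id: str, in_body: bool = True) -> str:
    """Build a Drive API `q` expression restricted to one folder."""
    # Drive queries escape backslashes and single quotes with a backslash.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    # fullText searches file content; name searches the file name only.
    field = "fullText" if in_body else "name"
    return (
        f"{field} contains '{escaped}' "
        f"and '{folder_id}' in parents and trashed=false"
    )


print(build_drive_query("machine learning", "root"))
print(build_drive_query("Valentine's Day", "root", in_body=False))
```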

In [4]:
import logging

logging.basicConfig(level=logging.INFO)

In [5]:
tool.run("machine learning")

INFO:langchain_googledrive.tools.google_drive.tool:query='machine learning'
INFO:langchain_googledrive.utilities.google_drive:Yield 'Machine Learning sample 1'-0 with "Yann LeCun was born in Fr...es from patterns in data."


"[Machine Learning sample 1](https://docs.google.com/document/d/1RlvTGJZBy2jB1OCYGOTsfIfVScO-kQgUcLPE13UC9Fg/edit?usp=drivesdk)<br/>\nYann LeCun was born in France and grew up with an engineer father, developing an interest in electronics and mechanics. He earned a masters' degree from the École Supérieure d'Ingénieurs en Électrotechnique et Électronique and focused on microchip design and automation. Machine Learning is the application of algorithms and statistical models to analyse and draw inferences from patterns in data."

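The string returned above has a recognizable shape: a markdown link, `<br/>`, a newline, then a snippet. If you need the parts separately, a sketch along these lines may help. `DriveHit` and `parse_hit` are hypothetical names. The pattern is inferred from this single output, so it handles one entry only and makes no assumption about how several results are joined.

```python
import re
from typing import NamedTuple, Optional


class DriveHit(NamedTuple):
    title: str
    url: str
    snippet: str


# Shape inferred from the output above: [title](url)<br/>\nsnippet
_HIT = re.compile(
    r"\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)<br/>\n(?P<snippet>.*)",
    re.DOTALL,
)


def parse_hit(result: str) -> Optional[DriveHit]:
    """Split one search result into title, URL, and snippet."""
    match = _HIT.match(result)
    if match is None:
        return None
    return DriveHit(match["title"], match["url"], match["snippet"].strip())


# Abbreviated copy of the output shown above.
sample = (
    "[Machine Learning sample 1]"
    "(https://docs.google.com/document/d/"
    "1RlvTGJZBy2jB1OCYGOTsfIfVScO-kQgUcLPE13UC9Fg/edit?usp=drivesdk)"
    "<br/>\nYann LeCun was born in France"
)
hit = parse_hit(sample)
print(hit.title)
print(hit.url)
```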
In [6]:
tool.description

"A wrapper around Google Drive Search. Useful for when you need to find a document in google drive. The input should be formatted as a list of entities separated with a space. As an example, a list of keywords is 'hello word'."

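The description asks for keywords separated by a space. When the keywords come from a list, they can be joined as in the sketch below; `keywords_to_query` is a hypothetical helper, not part of the library.

```python
from typing import Iterable


def keywords_to_query(keywords: Iterable[str]) -> str:
    """Join keywords with single spaces, skipping empty entries."""
    return " ".join(word.strip() for word in keywords if word.strip())


query = keywords_to_query(["machine", " learning ", ""])
print(query)
# tool.run(query)  # with the tool defined earlier in this notebook
```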
## Use within an Agent

In [9]:
%pip install openai

Collecting openai
  Obtaining dependency information for openai from https://files.pythonhosted.org/packages/ae/59/911d6e5f1d7514d79c527067643376cddcf4cb8d1728e599b3b03ab51c69/openai-0.28.0-py3-none-any.whl.metadata
  Using cached openai-0.28.0-py3-none-any.whl.metadata (13 kB)
Using cached openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
Successfully installed openai-0.28.0
Note: you may need to restart the kernel to use updated packages.


In [10]:
from langchain.agents import AgentType, initialize_agent
from langchain.llms import OpenAI

# temperature=0 keeps the model output as deterministic as possible.
llm = OpenAI(temperature=0)
agent = initialize_agent(
    tools=[tool],
    llm=llm,
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
)

In [11]:
agent.run("Search in google drive, who is 'Yann LeCun' ?")

INFO:langchain_googledrive.tools.google_drive.tool:query='Yann LeCun'
INFO:langchain_googledrive.utilities.google_drive:Yield 'Machine Learning sample 1'-0 with "Yann LeCun was born in Fr...es from patterns in data."


"Yann LeCun is a French computer scientist and machine learning expert who earned a masters' degree from the École Supérieure d'Ingénieurs en Électrotechnique et Électronique and focused on microchip design and automation."