## Simple Explanation: Setting Up Cohere Connectors for Your Walmart PDFs

Imagine Cohere's AI as a smart assistant that can answer questions based on your private files. A "Connector" is like a secure bridge: a small search service that Cohere calls to fetch relevant passages from your Google Drive (where your 2 Walmart PDFs are stored), so its responses are accurate and grounded. This is called RAG (Retrieval-Augmented Generation): it pulls facts from your PDFs to make answers more reliable and reduce "hallucinations" (made-up info).

We'll start from scratch: Get a Cohere account, set up Google permissions, create the connector, and test it with a query. No prior knowledge needed – I'll explain each term as we go. This is free to try with Cohere's trial API key, but for big PDFs, you might hit limits or need a paid plan.

By the end, you'll have code to ask: "What were Walmart's net sales in fiscal 2025?" and get an answer straight from your PDFs.

## Detailed Analysis: Step-by-Step Implementation

Let's break this down fully. I'll explain concepts without assuming you know anything (e.g., "API" means Application Programming Interface – a way for software to talk to Cohere). We'll use Python code (easy to run if you have basic setup like Google Colab). The goal is RAG: Cohere retrieves relevant chunks from your PDFs, then generates a response based only on them.

### Step 1: Prerequisites (Setup Basics)
- **Cohere Account**: Sign up for free at [dashboard.cohere.com](https://dashboard.cohere.com). Get your API key (a secret code) from the dashboard under "API Keys." It's like a password for Cohere's tools.
- **Google Drive Setup**: Upload your 2 Walmart PDFs to a Google Drive folder. Make a shareable link (right-click folder > Share > Get link > "Anyone with the link" for testing, but restrict for security later).
- **Google OAuth Credentials**: Cohere needs permission to access your Drive. This is where "client_id" and "client_secret" come in – they're like keys Google gives you.
  - Go to [console.cloud.google.com](https://console.cloud.google.com).
  - Create a new project (e.g., "Cohere-Walmart-PDFs").
  - Enable "Google Drive API" (search in Library).
  - Go to "Credentials" > Create OAuth client ID > Web application.
  - Add the redirect URI that Cohere expects. Cohere's connector OAuth docs list "https://api.cohere.com/v1/connectors/oauth/token"; confirm it against the current docs before saving.
  - Download the JSON file – it has your client_id and client_secret.
- **Python Setup**: Install libraries: Run `pip install cohere` in a terminal or notebook.

If anything's unclear, these are one-time setups. Now, let's code!
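To avoid copying the two OAuth values by hand, they can be read from the downloaded JSON file. This is a minimal sketch, assuming the file uses Google's usual layout in which the values sit under a top-level `"web"` key (or `"installed"` for desktop clients). The function name `load_google_oauth_client` is made up for illustration.

```python
import json
from pathlib import Path


def load_google_oauth_client(path):
    """Return (client_id, client_secret) from a Google OAuth client JSON file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Web-application credentials sit under "web"; desktop ones under "installed".
    section = data.get("web") or data.get("installed")
    if section is None:
        raise ValueError("no 'web' or 'installed' section found in credentials file")
    return section["client_id"], section["client_secret"]
```

Calling `load_google_oauth_client("client_secret.json")` (the filename is a placeholder for whatever Google named your download) gives you the pair to pass to Cohere in Step 2.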

### Step 2: Create the Connector (Link Your Google Drive)
Use Cohere's API to register a connector for your Drive. This tells Cohere: "Hey, use these PDFs for answers."

```python
import cohere

# Your Cohere API key (from dashboard)
COHERE_API_KEY = "your-cohere-api-key-here"

# Initialize Cohere client
co = cohere.Client(COHERE_API_KEY)

# Create connector for Google Drive
connector_response = co.connectors.create(
    name="walmart-pdfs-connector",  # Name it something simple
    # Must be the search endpoint of a connector service that can read the Drive
    # folder, not the folder's share link. The URL below is a placeholder.
    url="https://your-connector-host/search",
    description="Connector for 2 Walmart PDFs in Google Drive",
    active=True,  # Activate it
    oauth={
        "client_id": "your-google-client-id",  # From Google JSON
        "client_secret": "your-google-client-secret",  # From Google JSON
        "authorize_url": "https://accounts.google.com/o/oauth2/auth",
        "token_url": "https://oauth2.googleapis.com/token",
        "scope": "https://www.googleapis.com/auth/drive.readonly",  # Read-only access
    },
)

# Get the connector ID (your unique key)
connector_id = connector_response.connector.id
print("Your Connector ID:", connector_id)  # Save this!
```

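- **What Happens?**: Cohere registers the connector and returns its ID. The ID is like a handle; you pass it to the chat call later.
- **Troubleshooting**: If you get errors (e.g., an invalid or unreachable URL), check that the connector URL is reachable from the public internet and accepts Cohere's request. If the OAuth setup needs tweaks, the Google Console shows the related errors.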
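It helps to see what Cohere expects to find at the connector URL. As far as the connector docs describe it, Cohere sends a POST with a JSON body containing a `query` field and expects a JSON object with a `results` list of flat records (fields such as `text`, `title`, `url`). The sketch below illustrates that contract using only the standard library. The in-memory `DOCS` list and the word-overlap scoring are hypothetical stand-ins for real PDF text and real search.

```python
import json
import re
from http.server import BaseHTTPRequestHandler

# Hypothetical stand-in for text extracted from the PDFs.
DOCS = [
    {"id": "doc-1", "title": "Annual report", "text": "Net sales grew during the fiscal year."},
    {"id": "doc-2", "title": "Form 10-K", "text": "Operating income is reported by segment."},
]


def _words(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, docs=DOCS, limit=5):
    """Return {"results": [...]} with documents sharing words with the query."""
    query_words = _words(query)
    scored = [(len(query_words & _words(d["text"])), d) for d in docs]
    matches = [d for score, d in sorted(scored, key=lambda s: s[0], reverse=True) if score > 0]
    return {"results": matches[:limit]}


class SearchHandler(BaseHTTPRequestHandler):
    """Answers POST requests carrying {"query": "..."} with the search results."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        payload = json.dumps(search(body.get("query", ""))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, serve the handler with `http.server.HTTPServer(("", 8000), SearchHandler).serve_forever()`. This sketch has no authentication, so treat it as a way to understand the request and response shapes and not as something to deploy.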

### Step 3: Test the Connector with a Query (Grounded Response)
Now, ask a question. Cohere will search your PDFs and base the answer on them.

```python
# Your query
query = "What were Walmart's net sales in fiscal 2025?"

# Use Cohere's chat API with your connector
response = co.chat(
    model="command-r-plus",  # Cohere's model (free tier works for testing)
    message=query,  # Your question
    connectors=[{"id": connector_id}]  # Use your new connector
)

# Print the grounded answer
print("Answer:", response.text)
# If citations are available, they show PDF sources
if response.citations:
    print("\nSources from PDFs:")
    for citation in response.citations:
        print(f"- {citation.text} (from {citation.document_ids})")
```

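- **What Happens?**: Cohere sends your question to the connector, which searches your PDFs for passages about "net sales" (e.g., "$674.5 billion" from the attachment). Cohere then generates a response from those passages. It's "grounded": the answer is tied to retrieved text, which makes made-up info much less likely.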
- **Example Output**: "Based on the Walmart PDF, net sales for fiscal 2025 were $674.5 billion." (Actual from your files.)
- **RAG Explained**: Retrieval (searches PDFs) + Augmentation (adds the retrieved facts to the prompt) + Generation (creates answer). Your PDFs act as the "datasource"; the connector searches them each time you ask a question.
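The three RAG stages can be sketched in a few lines of plain Python. This is a toy illustration, not how Cohere implements retrieval: scoring is simple word overlap, the sample `chunks` are made-up placeholder sentences, and the generation step is left out because it is the model call shown above.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, chunks, top_k=2):
    """Retrieval: keep the chunks that share the most words with the query."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return [c for c in ranked[:top_k] if q & tokenize(c)]


def augment(query, passages):
    """Augmentation: put the retrieved passages in front of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"Answer using only these passages:\n{context}\n\nQuestion: {query}"


# Placeholder chunks standing in for text extracted from the PDFs.
chunks = [
    "Net sales are reported in the consolidated income statement.",
    "The board of directors met four times.",
    "Operating income is shown by segment.",
]
passages = retrieve("What were net sales?", chunks)
prompt = augment("What were net sales?", passages)
print(prompt)  # Generation would send this prompt to the model.
```

Only the first chunk shares words with the question, so it is the only passage that reaches the prompt. With a connector, Cohere handles the augmentation and generation steps for you.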

### Step 4: Enhancements and Tips (Advanced but Simple)
- **Add Streaming**: For real-time responses (like ChatGPT typing).
  ```python
  stream = co.chat_stream(
      model="command-r-plus",
      message=query,
      connectors=[{"id": connector_id}]
  )
  for event in stream:
      if event.event_type == "text-generation":
          print(event.text, end='')  # Streams answer
  ```
- **Handle Multiple PDFs**: The connector searches whatever files its backend can see, so pointing it at the whole Drive folder covers both PDFs. No manual upload to Cohere is needed.
- **Costs/Limits**: Free trial has query limits. For production, monitor Cohere's pricing (per query/token).
- **Security**: Use private Drive links. Revoke access anytime in Google Console.
- **Testing**: Upload your PDFs, run the code. Query something specific like "Walmart's operating income" to see it pull from files.

This is a complete, from-scratch setup: simple code, no extras. If you hit issues (e.g., API errors), share the error for tweaks!


In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

COHERE_API_KEY = os.getenv("COHERE_API_KEY")
your_google_client_id = os.getenv("your-google-client-id")
your_google_client_secret = os.getenv("your-google-client-secret")
your_google_drive_folder_link = os.getenv("your-google-drive-folder-link")
your_google_drive_search_script_url = os.getenv("your-google-drive-search-script-url")
print(your_google_drive_search_script_url)

https://script.google.com/macros/s/AKfycbz_F-co55UJOfq2GOATviMtI4Jq2n6AFJjY_qocRJ4hcaafSIc9DX5-dHjmmzTkus4V/exec/search


In [None]:
import cohere

# Initialize Cohere client
co = cohere.Client(COHERE_API_KEY)



# Create connector with custom endpoint
connector_response = co.connectors.create(
    name="walmart-pdfs-connector",
    url=your_google_drive_search_script_url,  # Your Apps Script URL (already ends in /search; Cohere calls it as registered)
    description="Connector for 2 Walmart PDFs via Apps Script",
    active=True
)

# Get the connector ID
connector_id = connector_response.connector.id
print("Your Connector ID:", connector_id)


BadRequestError: headers: {'content-type': 'application/json', 'date': 'Wed, 23 Jul 2025 17:23:39 GMT', 'content-length': '204', 'x-envoy-upstream-service-time': '157', 'server': 'envoy', 'via': '1.1 google', 'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000'}, status_code: 400, body: {'message': 'connector not reachable at https://script.google.com/macros/s/AKfycbz_F-co55UJOfq2GOATviMtI4Jq2n6AFJjY_qocRJ4hcaafSIc9DX5-dHjmmzTkus4V/exec/search. error: request failed with status code 401'}
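
The 400 from Cohere wraps a 401 returned by the Apps Script URL: Cohere's reachability check was rejected before it got an answer from the script. One likely cause, which is an assumption because the deployment settings are not shown here, is that the web app is deployed with access limited to signed-in Google users instead of "Anyone". The sketch below pulls the upstream status out of such a message and attaches a hint. The `HINTS` table and function names are illustrative and not part of the Cohere SDK.

```python
import re

# Illustrative hints for common upstream statuses.
HINTS = {
    401: "the endpoint requires credentials the caller did not send",
    403: "the endpoint refused the caller",
    404: "the URL path does not exist on the server",
}


def upstream_status(message):
    """Extract the connector's HTTP status from Cohere's error message, if present."""
    match = re.search(r"status code (\d{3})", message)
    return int(match.group(1)) if match else None


def explain(message):
    """Turn the error message into a short, readable diagnosis."""
    status = upstream_status(message)
    if status is None:
        return "no upstream status found in message"
    return f"{status}: {HINTS.get(status, 'unexpected response from the connector')}"
```

Pass the `message` field from the error body to `explain` to get a one-line diagnosis. After changing the deployment's access setting, redeploy the web app and retry the create call.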