<a href="https://colab.research.google.com/github/kerryback/data-portal-notebook/blob/main/Rice_Business_Data_Example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Rice Business Stock Market Data Portal

## Python API Tutorial

This notebook demonstrates how to access and analyze stock market data from the Rice Business Stock Market Data Portal using Python and DuckDB queries.

### 📋 What You'll Learn

- How to connect to Rice Business stock market data
- SQL queries with DuckDB syntax
- Stock analysis examples
- Data visualization with pandas and matplotlib

### 🔑 Getting Your Access Token

1. Visit the [Rice Business Data Portal](https://data-portal.rice-business.org)
2. Enter your `@rice.edu` email address
3. Check your email for the **access token**
4. Use that access token in the code below

**Note**: The access token is the same token you use to log into the web portal!

### 🔐 **Recommended: Store Your Token as a Secret in Google Colab**

For security, we recommend storing your access token as a **secret** in Google Colab instead of pasting it directly in the code:

#### **How to Add a Secret:**
1. **Click the 🔑 key icon** on the left sidebar in Colab
2. **Click "Add new secret"**
3. **Name:** `RICE_ACCESS_TOKEN`
4. **Value:** Paste your access token from the email
5. **Enable notebook access** by toggling the switch

#### **Benefits:**
- ✅ **Secure** - Token won't be visible in your notebook
- ✅ **Shareable** - You can share notebooks without exposing your token
- ✅ **Professional** - Best practice for sensitive credentials

If you don't set up a secret, you can still paste your token directly in the code below (look for `ACCESS_TOKEN = "YOUR_ACCESS_TOKEN_HERE"`).

---

### 🚀 Setup and Installation

First, let's install the required packages and download the Rice Data client.

In [None]:
# Install required packages
!pip install requests pandas matplotlib seaborn plotly -q

print("✅ Packages installed successfully!")

In [None]:
# Download the Rice Data Python client
import urllib.request
import os

# Download from GitHub repository
client_url = "https://raw.githubusercontent.com/kerryback/data-portal-notebook/main/rice_data_client.py"

try:
    urllib.request.urlretrieve(client_url, 'rice_data_client.py')
    print("✅ Rice Data client downloaded successfully!")
except Exception as e:
    print(f"❌ Download failed: {e}")
    print("💡 You can manually copy the client code from the Rice Business Data Portal")

In [None]:
# Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Set plotting style
plt.style.use('default')
sns.set_palette("husl")

print("✅ Libraries imported successfully!")

### 🔐 Connect to Rice Business Data

**Replace `YOUR_ACCESS_TOKEN_HERE` with your actual access token from the Rice Business Data Portal.**

This is the same token you received via email and use to log into the web portal!

In [None]:
# Import the Rice Data client
from rice_data_client import RiceDataClient

# 🔐 OPTION 1: Use Google Colab Secret (Recommended)
try:
    from google.colab import userdata
    ACCESS_TOKEN = userdata.get('RICE_ACCESS_TOKEN')
    print("✅ Using access token from Google Colab secrets")
except Exception:
    # Reached when not running in Colab, or when the secret is missing
    # or not shared with this notebook.
    # 🔑 OPTION 2: Paste your token directly (if you didn't set up a secret)
    ACCESS_TOKEN = "YOUR_ACCESS_TOKEN_HERE"  # Replace with your actual token
    if ACCESS_TOKEN == "YOUR_ACCESS_TOKEN_HERE":
        print("⚠️  Please either:")
        print("   1. Set up a secret named 'RICE_ACCESS_TOKEN' in Colab (recommended), or")
        print("   2. Replace 'YOUR_ACCESS_TOKEN_HERE' with your actual token")

# 🌐 Portal URL - Rice Business Data Portal
PORTAL_URL = "https://data-portal.rice-business.org"

# Connect to Rice Business data
try:
    client = RiceDataClient(
        access_token=ACCESS_TOKEN,
        base_url=PORTAL_URL
    )
    print("\n🎉 Successfully connected to Rice Business Stock Market Data!")
except Exception as e:
    print(f"❌ Connection failed: {e}")
    print("💡 Make sure you have:")
    print("   1. A valid access token from the Rice Business Data Portal")
    print("   2. An internet connection that can reach data-portal.rice-business.org")
    print("   3. A token that hasn't expired (tokens are valid for 48 hours)")

### 📊 Explore Available Data

Let's start by exploring what data is available in our database.

In [None]:
# Get available tables
tables = client.get_available_tables()
print("📋 Available Data Tables:")
for i, table in enumerate(tables, 1):
    print(f"   {i}. {table}")

# Get database statistics
stats = client.get_stats()
print("\n📈 Database Overview:")
print(f"   • Total companies: {stats['total_tickers']:,}")
print(f"   • Active companies: {stats['active_tickers']:,}")
print(f"   • Top exchanges: {', '.join([ex['exchange'] for ex in stats['top_exchanges'][:3]])}")

In [None]:
# Explore sectors in the database
sectors = client.list_sectors()
print("🏢 Top 10 Sectors by Number of Companies:")
print(sectors.head(10).to_string(index=False))

# Create a simple visualization
plt.figure(figsize=(12, 6))
top_sectors = sectors.head(8)
plt.bar(range(len(top_sectors)), top_sectors['ticker_count'], color='steelblue')
plt.xlabel('Sector')
plt.ylabel('Number of Companies')
plt.title('Top 8 Sectors by Number of Companies')
plt.xticks(range(len(top_sectors)), top_sectors['sector'], rotation=45, ha='right')
plt.tight_layout()
plt.show()

### 🔍 Basic DuckDB Queries

Now let's explore the data using SQL queries. DuckDB supports standard SQL syntax with some powerful extensions.

**💡 Note on Date Handling:** For maximum compatibility across different DuckDB environments (especially Google Colab), we use Python's `datetime` module to calculate dates and pass them as strings to SQL queries. This avoids relying on `INTERVAL` arithmetic or `CURRENT_DATE`, which may not be available in all environments.
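
The date-handling approach in the note can be sketched as follows. This is a minimal illustration rather than the portal's own code: the helper name `date_filter_query`, the table `ndl.example_prices`, and the column `date` are hypothetical placeholders, so substitute a real table and its date column before running the query.

```python
from datetime import date, timedelta


def date_filter_query(table, date_column, days_back, today=None):
    """Build a SQL string that keeps rows from the last `days_back` days.

    The cutoff is computed in Python and embedded as an ISO 'YYYY-MM-DD'
    literal, so the SQL needs neither INTERVAL nor CURRENT_DATE.
    """
    today = today or date.today()
    cutoff = (today - timedelta(days=days_back)).isoformat()
    return (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= '{cutoff}' "
        f"ORDER BY {date_column}"
    )


# Hypothetical table and column names, used only for illustration.
sql = date_filter_query("ndl.example_prices", "date", days_back=30)
print(sql)
# recent = client.query(sql)  # run against the portal once connected
```

Computing the cutoff in Python keeps the SQL itself free of date functions, and the optional `today` argument makes the helper easy to check with a fixed date.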

In [None]:
# Example 1: Find technology companies
print("💻 Technology Companies (Sample):")
tech_companies = client.query("""
    SELECT ticker, name, industry, exchange, location
    FROM ndl.tickers 
    WHERE sector = 'Technology' 
    AND isdelisted = 'N'
    ORDER BY ticker
    LIMIT 10
""")

print(tech_companies.to_string(index=False))
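
The query above hard-codes the sector. As a hedged sketch, the same query can be generated for any sector with a small helper. The function name `sector_query` is hypothetical; the table and column names are the ones used in the cell above. Single quotes in the sector name are doubled, which is the standard SQL way to escape them inside a string literal.

```python
def sector_query(sector, limit=10):
    """Return the 'active companies in a sector' query as a SQL string."""
    safe_sector = sector.replace("'", "''")  # escape embedded single quotes
    return f"""
    SELECT ticker, name, industry, exchange, location
    FROM ndl.tickers
    WHERE sector = '{safe_sector}'
    AND isdelisted = 'N'
    ORDER BY ticker
    LIMIT {int(limit)}
    """


print(sector_query("Technology"))
# Assumes 'Healthcare' is a sector value present in the data:
# healthcare = client.query(sector_query("Healthcare"))
```

The sector names returned by `client.list_sectors()` are the values to pass in.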

In [None]:
# Summary of what this notebook covered
print("🎉 Rice Business Stock Market Data Analysis Complete!")
print("\n📊 What we covered:")
print("   ✅ Connected to Rice Business data with access token")
print("   ✅ Explored available data tables and statistics")
print("   ✅ Performed basic DuckDB queries")
print("\n💡 Ready to dive deeper into financial analysis!")
print("\n🔗 Don't forget to get your own access token from the Rice Business Data Portal")