# Daily Portfolio Data Loader

This notebook automates the daily loading of portfolio data from Interactive Brokers into the database.

## Overview

This notebook covers:
- Connecting to Interactive Brokers Gateway
- Extracting portfolio positions by account
- Data validation and cleaning
- Security master synchronization
- Database storage of portfolio holdings

## Prerequisites

Before running this notebook, ensure you have:
- Interactive Brokers Gateway running on localhost:4001
- Required Python packages installed
- Database credentials stored in keyring
- Appropriate database schema and functions
- Custom broker modules available
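
The first prerequisite can be checked before any broker code runs. The following is a minimal sketch, using only the standard library, that tests whether something is listening on the gateway port; `is_port_open` is a hypothetical helper and not part of the custom broker modules.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# 4001 is the gateway port named in the prerequisites.
print(f"IB Gateway reachable: {is_port_open('localhost', 4001)}")
```

A successful TCP connection only shows that the port is open; it does not confirm that API access is enabled in the gateway settings.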

## Setup and Imports

In [1]:
import sys
import os
from datetime import datetime
import pandas as pd

# Get the notebook's working directory and go up two levels to the project root
current_dir = os.getcwd()
parent_dir = os.path.dirname(os.path.dirname(current_dir))

if parent_dir not in sys.path:
    sys.path.append(parent_dir)

print("=== Portfolio Data Loader Started ===")
print(f"Current directory: {current_dir}")
print(f"Added to path: {parent_dir}")

# Verify the required folders exist
brokers_path = os.path.join(parent_dir, 'brokers')
data_eng_path = os.path.join(parent_dir, 'data_engineering')

print(f"Brokers folder exists: {os.path.exists(brokers_path)}")
print(f"Data engineering folder exists: {os.path.exists(data_eng_path)}")

=== Portfolio Data Loader Started ===
Current directory: c:\Users\menon\OneDrive\Documents\SourceCode\InvestmentManagement\toolkit\notebooks
Added to path: c:\Users\menon\OneDrive\Documents\SourceCode\InvestmentManagement
Brokers folder exists: True
Data engineering folder exists: True


In [2]:
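# Import required modules (explicit names instead of a wildcard import)
from brokers.interactive_broker import (
    DateUtils,
    IBDataValidator,
    InteractiveBroker,
    SecurityMasterManager,
)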
from data_engineering.database import db_functions as database

print("Required modules imported successfully")

Required modules imported successfully


## Trading Day Information

In [3]:
# Display current trading day
trading_day = DateUtils.get_trading_day()
print(f"Current trading day: {trading_day}")
print(f"Timestamp: {datetime.now()}")

Current trading day: 2025-09-05
Timestamp: 2025-09-07 21:17:22.944460
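
The output above shows a Sunday-evening run resolving to the preceding Friday. How `DateUtils.get_trading_day()` arrives at that date is not shown in this notebook; the sketch below is a weekday-only approximation that reproduces the same result. `most_recent_weekday` is a hypothetical name, and exchange holidays are deliberately ignored.

```python
from datetime import date, timedelta


def most_recent_weekday(today: date) -> date:
    """Return today if it is Monday to Friday, otherwise the preceding Friday.

    Assumption: exchange holidays are not considered in this sketch.
    """
    day = today
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


print(most_recent_weekday(date(2025, 9, 7)))  # Sunday -> 2025-09-05
```

A production version would also need a holiday calendar and a rule for runs that start before the market closes.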


## Interactive Brokers Connection

### Establishing Connection

**Note**: Ensure your IB Gateway is running and configured to accept API connections on port 4001.
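
A gateway that is still starting up may reject the first attempt. Whether `InteractiveBroker.connect()` retries internally is not shown here, so the following is only a sketch of a generic retry wrapper for any callable that returns a truthy value on success. `connect_with_retry` and its parameters are hypothetical.

```python
import time
from typing import Callable


def connect_with_retry(connect: Callable[[], bool], attempts: int = 3,
                       delay_seconds: float = 2.0) -> bool:
    """Call connect() until it returns a truthy value or attempts run out."""
    for attempt in range(1, attempts + 1):
        if connect():
            return True
        if attempt < attempts:
            print(f"Attempt {attempt} failed; retrying in {delay_seconds}s")
            time.sleep(delay_seconds)
    return False
```

In this notebook the call would look like `connect_with_retry(ib.connect)`, assuming `ib.connect()` can safely be called more than once on the same object.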

In [4]:
print("\n=== Connecting to IB Gateway ===")

# Initialize the Interactive Brokers client
ib = InteractiveBroker()

# Connect to IB Gateway
if not ib.connect():
    print("❌ Failed to connect to IB Gateway. Please check:")
    print("   - IB Gateway is running")
    print("   - API connections are enabled")
    print("   - Port 4001 is available")
    raise ConnectionError("IB Gateway connection failed")
else:
    print("✅ Successfully connected to IB Gateway")


=== Connecting to IB Gateway ===
Attempting to connect to IB Gateway...
Connected to IB Gateway: True
Connection successful!
Retrieved 119 account summary items
✅ Successfully connected to IB Gateway


## Portfolio Data Extraction

### Retrieving Account Positions

In [5]:
print("\n=== Retrieving Interactive Broker Data ===")

# Get positions for specific account
account_id = 'U20761295'
print(f"Retrieving positions for account: {account_id}")

ib_account_positions = ib.get_positions_by_account(account_id)

if ib_account_positions.empty:
    print("❌ No portfolio data retrieved")
    print("This could indicate:")
    print("   - Account has no positions")
    print("   - Account ID is incorrect")
    print("   - Connection issues")
    raise Exception("No portfolio data available")
else:
    print(f"✅ Retrieved {len(ib_account_positions)} positions")
    print("\nFirst few records:")
    display(ib_account_positions.head())


=== Retrieving Interactive Broker Data ===
Retrieving positions for account: U20761295
Retrieved 4 positions for account U20761295
   as_of_date portfolio_short_name symbol ib_security_type ib_exchange  \
0  2025-09-05            U20761295   IONQ              STK        NYSE   
1  2025-09-05            U20761295     MU              STK      NASDAQ   
2  2025-09-05            U20761295   UBER              STK        NYSE   
3  2025-09-05            U20761295   EMBC              STK      NASDAQ   

  ib_currency  held_shares   avg_cost  
0         USD          2.0   40.97410  
1         USD          1.0  124.86930  
2         USD          2.0   93.00905  
3         USD          1.0   10.85750
✅ Retrieved 4 positions

First few records:


| | as_of_date | portfolio_short_name | symbol | ib_security_type | ib_exchange | ib_currency | held_shares | avg_cost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2025-09-05 | U20761295 | IONQ | STK | NYSE | USD | 2.0 | 40.9741 |
| 1 | 2025-09-05 | U20761295 | MU | STK | NASDAQ | USD | 1.0 | 124.8693 |
| 2 | 2025-09-05 | U20761295 | UBER | STK | NYSE | USD | 2.0 | 93.00905 |
| 3 | 2025-09-05 | U20761295 | EMBC | STK | NASDAQ | USD | 1.0 | 10.8575 |


## Data Validation and Cleaning

### Data Quality Checks

In [6]:
print("\n=== Data Validation ===")

# Validate the data
is_valid, issues = IBDataValidator.validate_ib_data(ib_account_positions)

if not is_valid:
    print("⚠️  Data validation issues found:")
    for issue in issues:
        print(f"   - {issue}")
    print("\nProceeding with data cleaning...")
else:
    print("✅ Data validation passed - no issues found")

# Clean the data
print("\n=== Data Cleaning ===")
original_count = len(ib_account_positions)
ib_account_positions = IBDataValidator.clean_ib_data(ib_account_positions)
cleaned_count = len(ib_account_positions)

print(f"Records before cleaning: {original_count}")
print(f"Records after cleaning: {cleaned_count}")
if original_count != cleaned_count:
    print(f"Removed {original_count - cleaned_count} invalid records")


=== Data Validation ===
✅ Data validation passed - no issues found

=== Data Cleaning ===
Cleaned portfolio data: 4 records
Records before cleaning: 4
Records after cleaning: 4
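
The rules applied by `IBDataValidator.validate_ib_data` are not shown in this notebook. As an illustration only, here is a sketch of the kind of checks such a validator might run, returning the same `(is_valid, issues)` pair used above. `check_positions`, `REQUIRED_COLUMNS`, and the specific rules are assumptions, not the actual implementation.

```python
import pandas as pd

# Assumed minimum set of columns, taken from the positions output above.
REQUIRED_COLUMNS = ["as_of_date", "portfolio_short_name", "symbol",
                    "held_shares", "avg_cost"]


def check_positions(df: pd.DataFrame) -> tuple[bool, list[str]]:
    """Return (is_valid, issues) for a positions DataFrame."""
    issues: list[str] = []
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        # Later checks need these columns, so stop early.
        return False, [f"Missing columns: {missing}"]
    if df["symbol"].isna().any():
        issues.append("Rows with missing symbol")
    if df["held_shares"].isna().any():
        issues.append("Rows with missing held_shares")
    if (df["held_shares"] == 0).any():
        issues.append("Rows with zero held_shares")
    dupes = df.duplicated(subset=["as_of_date", "portfolio_short_name", "symbol"])
    if dupes.any():
        issues.append(f"{int(dupes.sum())} duplicate position rows")
    return not issues, issues
```

Negative share counts are not flagged because short positions are legitimate holdings.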


## Database Operations

### Database Connection

In [7]:
print("\n=== Database Connection ===")

# Establish database connection
try:
    engine, connection, conn_str, session = database.get_db_connection()
    print("✅ Database connection established")
except Exception as e:
    print(f"❌ Database connection failed: {e}")
    raise


=== Database Connection ===
Database connection successful.
✅ Database connection established


### Security Master Data Retrieval

In [8]:
print("\n=== Security Master Data Retrieval ===")

# Read security master data
security_master = database.read_security_master(session, engine)
print(f"Retrieved {len(security_master)} securities from master data")

# Merge positions with security master
merged_securities = pd.merge(
    ib_account_positions,
    security_master, 
    on='symbol', 
    how='left'
)

print(f"Merged {len(merged_securities)} records")
print("\nMerged data structure:")
display(merged_securities.head())


=== Security Master Data Retrieval ===
Retrieved 159115 securities from master data
Merged 4 records

Merged data structure:


| | as_of_date | portfolio_short_name | symbol | ib_security_type | ib_exchange | ib_currency | held_shares | avg_cost | security_id | name | ... | sector | industry_group | industry | security_type | asset_class | exchange | is_active | source_vendor | upsert_date | upsert_by |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2025-09-05 | U20761295 | IONQ | STK | NYSE | USD | 2.0 | 40.9741 | 80174 | IonQ Inc. Common Stock | ... | Industrials | Capital Goods | Building Products | Common Stock | Equity | ASE | 1 | FinanceDatabase | 2025-06-13 17:10:30 | mf |
| 1 | 2025-09-05 | U20761295 | MU | STK | NASDAQ | USD | 1.0 | 124.8693 | 99102 | Micron Technology, Inc. | ... | Information Technology | Semiconductors & Semiconductor Equipment | Semiconductors & Semiconductor Equipment | Common Stock | Equity | NMS | 1 | FinanceDatabase | 2025-06-13 17:10:30 | mf |
| 2 | 2025-09-05 | U20761295 | UBER | STK | NYSE | USD | 2.0 | 93.00905 | 159128 | UBER | ... | | | | Common Stock | Equity | XNYS | 1 | IB | 2025-08-17 21:53:09 | ib_portfolio_loader |
| 3 | 2025-09-05 | U20761295 | EMBC | STK | NASDAQ | USD | 1.0 | 10.8575 | 61581 | Embecta Corp. Common Stock | ... | Health Care | Health Care Equipment & Services | Health Care Technology | Common Stock | Equity | NMS | 0 | FinanceDatabase | 2025-06-13 17:10:30 | mf |
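
Because the merge key is `symbol` alone, a security master with more than one row per symbol would silently multiply position rows. The sketch below uses two real `pd.merge` options, `validate` and `indicator`, to guard against that and to list unmatched symbols. The toy frames and the symbol `NEWCO` are made up for illustration.

```python
import pandas as pd

positions = pd.DataFrame({"symbol": ["IONQ", "MU", "NEWCO"],
                          "held_shares": [2.0, 1.0, 5.0]})
# Toy stand-in for the security master; NEWCO is deliberately absent.
master = pd.DataFrame({"symbol": ["IONQ", "MU"],
                       "security_id": [80174, 99102]})

merged = pd.merge(
    positions,
    master,
    on="symbol",
    how="left",
    validate="many_to_one",  # raises MergeError if master repeats a symbol
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)
unmatched = merged.loc[merged["_merge"] == "left_only", "symbol"].tolist()
print(unmatched)  # ['NEWCO']
```

If the real master legitimately lists one symbol on several exchanges, the join key would need to include the exchange or currency instead of relying on `validate`.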


### Missing Securities Management

In [10]:
sm_manager = SecurityMasterManager()
print("\n=== Missing Securities Check ===")


# Identify missing securities
missing_tickers = merged_securities[merged_securities['security_id'].isna()].copy()

if not missing_tickers.empty:
    print(f"⚠️  Found {len(missing_tickers)} missing securities:")
    # Master columns are NaN for unmatched rows, so show the IB-side fields instead
    print(missing_tickers[['symbol', 'ib_security_type', 'ib_exchange']].to_string())

    # Insert missing securities
    print("\n=== Inserting Missing Securities ===")
    if sm_manager.insert_missing_securities(missing_tickers):
        print("✅ Missing securities inserted successfully")

        # Re-read security master data to include newly inserted records
        security_master = database.read_security_master(session, engine)
        print(f"Updated security master now contains {len(security_master)} records")

        # Re-merge the data with updated securities table
        merged_securities = pd.merge(
            ib_account_positions,
            security_master,
            on='symbol',
            how='left'
        )
        print("✅ Re-merged portfolio data with updated SecurityMaster")
    else:
        print("❌ Failed to insert missing securities")
else:
    print("✅ All securities found in master data - no insertions needed")

Database connection successful.

=== Missing Securities Check ===
✅ All securities found in master data - no insertions needed


### Final Data Verification

In [11]:
print("\n=== Final Data Verification ===")

# Final verification
remaining_missing = merged_securities[merged_securities['security_id'].isna()]

if not remaining_missing.empty:
    print(f"⚠️  Warning: {len(remaining_missing)} records still have missing security data:")
    # Master columns are NaN for unmatched rows, so show the IB-side fields instead
    display(remaining_missing[['symbol', 'ib_security_type', 'ib_exchange']])

    # Flag for manual review; the notebook does not pause for input here
    print("\n❓ Do you want to continue with incomplete data? (Manual intervention may be required)")
else:
    print("✅ All positions have corresponding SecurityMaster records")

# Display data quality metrics
total_positions = len(merged_securities)
valid_positions = len(merged_securities[merged_securities['security_id'].notna()])
completion_rate = (valid_positions / total_positions) * 100 if total_positions > 0 else 0

print("\n📊 Data Quality Summary:")
print(f"   Total positions: {total_positions}")
print(f"   Valid positions: {valid_positions}")
print(f"   Completion rate: {completion_rate:.1f}%")


=== Final Data Verification ===
✅ All positions have corresponding SecurityMaster records

📊 Data Quality Summary:
   Total positions: 4
   Valid positions: 4
   Completion rate: 100.0%
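
The verification cell prints a question when records are incomplete but does not stop the run. One possible way to make that decision explicit is a small gate that either raises or drops the incomplete rows. `require_complete` and its `allow_incomplete` flag are hypothetical; the frame is assumed to have a `symbol` column, as `merged_securities` does.

```python
import pandas as pd


def require_complete(df: pd.DataFrame, key: str = "security_id",
                     allow_incomplete: bool = False) -> pd.DataFrame:
    """Return rows with a non-null key; raise if any are missing and not allowed."""
    missing = df[df[key].isna()]
    if not missing.empty and not allow_incomplete:
        symbols = missing["symbol"].tolist()
        raise ValueError(f"{len(missing)} positions lack {key}: {symbols}")
    return df[df[key].notna()].copy()
```

Calling `merged_securities = require_complete(merged_securities)` before the final merge would halt the notebook instead of writing a partial portfolio.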


## Portfolio Data Preparation

### Portfolio Reference Data

In [15]:
print("\n=== Portfolio Reference Data ===")

# Get unique portfolio short names
portfolio_names = merged_securities['portfolio_short_name'].unique().tolist()
print(f"Portfolio names found: {portfolio_names}")

# Read portfolio reference data
df_portfolio_data = database.read_portfolio(
    session, 
    engine, 
    portfolio_names
)

print(f"Retrieved {len(df_portfolio_data)} portfolio records")
if not df_portfolio_data.empty:
    print("\nPortfolio reference data:")
    display(df_portfolio_data.head())


=== Portfolio Reference Data ===
Portfolio names found: ['U20761295']
Retrieved 1 portfolio records

Portfolio reference data:


| | port_id | portfolio_short_name | portfolio_name | portfolio_type |
| --- | --- | --- | --- | --- |
| 0 | 4 | U20761295 | client | Portfolio |


### Final Data Merge

In [16]:
print("\n=== Final Data Preparation ===")

# Merge portfolio reference data with positions
df_portfolio_market_data = pd.merge(df_portfolio_data, merged_securities, on='portfolio_short_name')
print(f"Final merged dataset contains {len(df_portfolio_market_data)} records")

# Select final columns for database storage
columns_for_storage = [
    'as_of_date',
    'port_id', 
    'security_id', 
    'held_shares'
]

df_portfolio_market_data = df_portfolio_market_data[columns_for_storage]

print("\n📋 Final dataset for storage:")
print(f"   Columns: {list(df_portfolio_market_data.columns)}")
print(f"   Records: {len(df_portfolio_market_data)}")
print(f"   Trading date: {df_portfolio_market_data['as_of_date'].iloc[0] if not df_portfolio_market_data.empty else 'N/A'}")

print("\nSample of final data:")
display(df_portfolio_market_data.head())


=== Final Data Preparation ===
Final merged dataset contains 4 records

📋 Final dataset for storage:
   Columns: ['as_of_date', 'port_id', 'security_id', 'held_shares']
   Records: 4
   Trading date: 2025-09-05

Sample of final data:


| | as_of_date | port_id | security_id | held_shares |
| --- | --- | --- | --- | --- |
| 0 | 2025-09-05 | 4 | 80174 | 2.0 |
| 1 | 2025-09-05 | 4 | 99102 | 1.0 |
| 2 | 2025-09-05 | 4 | 159128 | 2.0 |
| 3 | 2025-09-05 | 4 | 61581 | 1.0 |
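
Before writing, it can be worth confirming that no holding appears twice, since the earlier merges could fan out rows. The sketch below assumes that `(as_of_date, port_id, security_id)` is the logical key of the holdings table, which this notebook does not state. `find_duplicate_holdings` is a hypothetical helper.

```python
import pandas as pd

# Assumed logical key of one holding row.
KEY_COLUMNS = ["as_of_date", "port_id", "security_id"]


def find_duplicate_holdings(df: pd.DataFrame) -> pd.DataFrame:
    """Return every row whose key columns repeat elsewhere in the frame."""
    return df[df.duplicated(subset=KEY_COLUMNS, keep=False)]


holdings = pd.DataFrame({
    "as_of_date": ["2025-09-05"] * 4,
    "port_id": [4, 4, 4, 4],
    "security_id": [80174, 99102, 159128, 61581],
    "held_shares": [2.0, 1.0, 2.0, 1.0],
})
print(find_duplicate_holdings(holdings).empty)  # True
```

A non-empty result would point back to a symbol that matched more than one security master record.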


## Database Storage

### Writing Portfolio Holdings

In [17]:
print("\n=== Writing to Database ===")

try:
    # Write portfolio holdings to database
    database.write_portfolio_holdings(df_portfolio_market_data, session)
    print("✅ Portfolio holdings successfully written to database")

    # Summary of what was written
    print("\n📈 Storage Summary:")
    print(f"   Records written: {len(df_portfolio_market_data)}")
    print(f"   Portfolios: {df_portfolio_market_data['port_id'].nunique()}")
    print(f"   Securities: {df_portfolio_market_data['security_id'].nunique()}")
    print(f"   As of date: {df_portfolio_market_data['as_of_date'].iloc[0]}")

except Exception as e:
    print(f"❌ Error writing to database: {e}")
    raise


=== Writing to Database ===
✅ Portfolio holdings successfully written to database

📈 Storage Summary:
   Records written: 4
   Portfolios: 1
   Securities: 4
   As of date: 2025-09-05
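
What `database.write_portfolio_holdings` does when the notebook is rerun for the same date is not shown. If idempotent reruns matter, an upsert inside a single transaction is one option. The sketch below demonstrates the idea with the standard-library `sqlite3` module as a stand-in for the real database; the table name `portfolio_holdings`, its primary key, and `upsert_holdings` are all assumptions.

```python
import sqlite3

import pandas as pd


def upsert_holdings(conn: sqlite3.Connection, df: pd.DataFrame) -> int:
    """Insert or update holdings in one transaction; return rows submitted."""
    # Convert to plain Python types so the driver can bind every value.
    rows = [
        (str(r.as_of_date), int(r.port_id), int(r.security_id), float(r.held_shares))
        for r in df.itertuples(index=False)
    ]
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            """
            INSERT INTO portfolio_holdings
                (as_of_date, port_id, security_id, held_shares)
            VALUES (?, ?, ?, ?)
            ON CONFLICT (as_of_date, port_id, security_id)
            DO UPDATE SET held_shares = excluded.held_shares
            """,
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE portfolio_holdings (
        as_of_date  TEXT,
        port_id     INTEGER,
        security_id INTEGER,
        held_shares REAL,
        PRIMARY KEY (as_of_date, port_id, security_id)
    )
    """
)
holdings = pd.DataFrame({
    "as_of_date": ["2025-09-05", "2025-09-05"],
    "port_id": [4, 4],
    "security_id": [80174, 99102],
    "held_shares": [2.0, 1.0],
})
upsert_holdings(conn, holdings)
upsert_holdings(conn, holdings)  # second run updates rows instead of duplicating
print(conn.execute("SELECT COUNT(*) FROM portfolio_holdings").fetchone()[0])  # 2
```

The `ON CONFLICT ... DO UPDATE` clause requires SQLite 3.24 or later; other databases offer comparable upsert or merge statements with different syntax.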


## Cleanup and Disconnection

In [18]:
print("\n=== Cleaning up connections ===")

try:
    if ib and ib.is_connected():
        ib.disconnect()
        print("✅ IB Gateway connection closed successfully")
    else:
        print("ℹ️  No active IB connection to close")

    # Close database session if needed
    if 'session' in locals():
        session.close()
        print("✅ Database session closed successfully")

except Exception as e:
    print(f"⚠️  Error during cleanup: {e}")

print("\n🎉 Portfolio data loading completed successfully!")


=== Cleaning up connections ===
Disconnected from IB Gateway
✅ IB Gateway connection closed successfully
✅ Database session closed successfully

🎉 Portfolio data loading completed successfully!


## Summary

This notebook demonstrates a complete daily portfolio loading workflow:

### ✅ What This Notebook Does:

1. **Data Extraction**: Connects to Interactive Brokers Gateway and extracts current portfolio positions
2. **Data Validation**: Validates and cleans the extracted data to ensure quality
3. **Security Management**: Automatically handles missing securities by inserting them into the master data
4. **Data Enrichment**: Merges position data with reference data (portfolios and securities)
5. **Database Storage**: Persists the processed portfolio holdings for analysis
6. **Error Handling**: Stops with a diagnostic message when the IB connection, position retrieval, or database steps fail

### 🔄 Automation Features:

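- **Trading Day Awareness**: Loads positions as of the trading day returned by `DateUtils.get_trading_day()` (in the run above, a Sunday run resolved to the preceding Friday, 2025-09-05)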
- **Data Quality Checks**: Built-in validation and cleaning processes
- **Missing Data Handling**: Automatic insertion of new securities
- **Connection Management**: Proper cleanup of all connections
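
In the notebook, cleanup lives in its own cell, so it is skipped if an earlier cell raises and execution stops. If the workflow is ever moved into a single cell or script, `contextlib.ExitStack` is one way to guarantee that cleanup runs. The sketch below uses a hypothetical `FakeResource` class in place of the IB client and database session.

```python
from contextlib import ExitStack


class FakeResource:
    """Stand-in for the IB client or DB session; only records close calls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.closed = False

    def close(self) -> None:
        self.closed = True
        print(f"{self.name} closed")


ib_like = FakeResource("ib")
session_like = FakeResource("session")

try:
    with ExitStack() as stack:
        # Callbacks run in reverse order when the block exits, even on error.
        stack.callback(ib_like.close)
        stack.callback(session_like.close)
        raise RuntimeError("simulated failure in a later step")
except RuntimeError as exc:
    print(f"Run failed: {exc}")
```

With the real objects, the registered callbacks would be `ib.disconnect` and `session.close`.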

### 📊 Data Flow:

```
IB Gateway → Raw Positions → Validation/Cleaning → Security Master Sync →
Portfolio Reference Merge → Final Dataset → Database Storage
```

### 🛠️ Troubleshooting:

**Common Issues:**
- **IB Connection Failed**: 
  - Check IB Gateway is running
  - Verify API is enabled in IB Gateway settings
  - Ensure port 4001 is available
- **No Portfolio Data**: 
  - Verify account ID is correct
  - Check account has positions
  - Confirm account access permissions
- **Database Errors**: 
  - Check database connectivity
  - Verify credentials in keyring
  - Confirm database schema exists
- **Missing Securities**: 
  - Review security master insert permissions
  - Check security data format
  - Verify symbol mapping logic

### 🔄 Next Steps:

Consider enhancing this workflow with:
- **Scheduling**: Set up automated daily runs
- **Monitoring**: Add logging and alerting
- **Historical Data**: Include historical position tracking
- **Multiple Accounts**: Support for multiple IB accounts
- **Data Quality Metrics**: Enhanced validation reporting
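
As a starting point for the multiple-accounts item, the per-account retrieval can be wrapped in a loop that stacks the results into one frame. This is a sketch only: `load_accounts` is a hypothetical helper, and it assumes the fetch function returns a DataFrame that may be empty, as `ib.get_positions_by_account` appears to do above.

```python
from typing import Callable, Iterable

import pandas as pd


def load_accounts(account_ids: Iterable[str],
                  fetch: Callable[[str], pd.DataFrame]) -> pd.DataFrame:
    """Fetch positions per account and stack the non-empty results."""
    frames = []
    for account_id in account_ids:
        df = fetch(account_id)
        if df.empty:
            print(f"No positions for account {account_id}; skipping")
            continue
        frames.append(df)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

In this notebook the call would be `load_accounts(account_ids, ib.get_positions_by_account)`, after which the validation, merge, and storage cells could run unchanged on the combined frame.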