# Upload Files to SharePoint using Microsoft Graph API

This notebook demonstrates how to upload files to Microsoft SharePoint using **Microsoft Graph API** with Azure AD App registration.

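This approach sends plain HTTP requests to the Graph API. For app-only authentication (no signed-in user) it tends to be more dependable than the SharePoint REST API, because Graph accepts a client secret, whereas app-only access to the SharePoint REST API through Azure AD generally requires a certificate.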

## Installation

Install the required libraries:

```bash
pip install requests python-dotenv pandas pyarrow openpyxl
```

**What these libraries do:**
- `requests` - HTTP library for making Graph API calls
- `python-dotenv` - Load sensitive credentials from .env file (security best practice)
- `pandas`, `pyarrow`, `openpyxl` - For creating sample data files (CSV, Parquet, Excel)

## 1. Import Libraries

Import the necessary modules for Microsoft Graph API authentication and file operations.

In [1]:
import requests
import json
import os
from dotenv import load_dotenv
from urllib.parse import quote

# For creating sample data
import pandas as pd
import io

print("✅ Libraries imported successfully")

✅ Libraries imported successfully


## 2. Azure App Registration Setup

Before using this notebook, you need to register an Azure AD application. Follow these steps:

### Step 1: Register an Azure App

1. Go to [Azure Portal](https://portal.azure.com)
2. Navigate to **Azure Active Directory** (now named **Microsoft Entra ID**) → **App registrations**
3. Click **New registration**
4. Enter details:
   - **Name**: Python SharePoint Upload Script
   - **Supported account types**: Accounts in this organizational directory only
   - **Redirect URI**: Leave blank
5. Click **Register**
6. **Copy and save**:
   - **Application (client) ID**
   - **Directory (tenant) ID**

### Step 2: Create Client Secret

1. In your app, go to **Certificates & secrets** (left sidebar)
2. Click **New client secret**
3. Add description: "Python Script Secret"
4. Choose expiration: 6 months, 12 months, or 24 months
5. Click **Add**
6. **IMMEDIATELY COPY THE VALUE** (the long string with special characters)
   - ⚠️ You can only see this once!
   - ✅ Use the **Value**, not the **Secret ID**

### Step 3: Grant API Permissions

1. Go to **API permissions** (left sidebar)
2. Click **Add a permission**
3. Select **Microsoft Graph**
4. Select **Application permissions** (NOT Delegated!) ⚠️
5. Search for and add: `Sites.ReadWrite.All`
6. Click **Add permissions**
7. Click **Grant admin consent for [Your Organization]**
8. Confirm by clicking **Yes**
9. Verify green checkmarks appear: ✅ Granted for [Your Org]

### Step 4: Create .env File

Create a `.env` file in the same directory as this notebook with these values:

```env
TENANT_ID=your-tenant-id-here
CLIENT_ID=your-client-id-here
SECRET_VALUE=your-secret-value-here
SHAREPOINT_SITE_URL=https://yourcompany.sharepoint.com/sites/yoursite
```

**Security tips:**
- Add `.env` to your `.gitignore` file
- Never commit credentials to version control
- Use the secret VALUE (long string), not the Secret ID (UUID format)
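
A missing or misspelled variable in `.env` is easier to diagnose if it fails loudly before any request is sent. The following is a minimal sketch, assuming the four variable names from the `.env` example above and that `load_dotenv()` has already run; `load_required_env` is a hypothetical helper, not part of `python-dotenv`.

```python
import os

# Variable names taken from the .env example above
REQUIRED_VARS = ("TENANT_ID", "CLIENT_ID", "SECRET_VALUE", "SHAREPOINT_SITE_URL")


def load_required_env(names=REQUIRED_VARS, environ=None):
    """Return the requested variables as a dict, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}
```

Calling `load_required_env()` right after `load_dotenv()` would replace the placeholder defaults with an explicit error, which may be preferable in scheduled jobs.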

## 3. Load Configuration

Load your Azure AD credentials from the .env file.

In [2]:
# Load credentials from .env file
load_dotenv()

# Get Azure App credentials from environment variables
TENANT_ID = os.getenv('TENANT_ID', 'your-tenant-id')
CLIENT_ID = os.getenv('CLIENT_ID', 'your-client-id')
CLIENT_SECRET = os.getenv('SECRET_VALUE', 'your-client-secret')
SHAREPOINT_SITE_URL = os.getenv('SHAREPOINT_SITE_URL', 'https://yourcompany.sharepoint.com/sites/yoursite')

# Clean URL (remove trailing slashes and document paths)
SHAREPOINT_SITE_URL = SHAREPOINT_SITE_URL.rstrip('/').split('/Shared')[0].split('/Documents')[0]

# Verify credentials are loaded (don't print actual values!)
print("🔐 Configuration loaded:")
print(f"   SharePoint Site: {SHAREPOINT_SITE_URL}")
print(f"   Tenant ID: {TENANT_ID[:8]}..." if len(TENANT_ID) > 8 else f"   Tenant ID: {TENANT_ID}")
print(f"   Client ID: {CLIENT_ID[:8]}..." if len(CLIENT_ID) > 8 else f"   Client ID: {CLIENT_ID}")
print(f"   Client Secret: {'*' * min(len(CLIENT_SECRET), 20)}")

if CLIENT_ID == 'your-client-id':
    print("\n⚠️  WARNING: Using default credentials. Update your .env file!")

🔐 Configuration loaded:
   SharePoint Site: https://yusufmartin.sharepoint.com/sites/site0
   Tenant ID: f06e42e3...
   Client ID: 4c9b2a60...
   Client Secret: ********************


## 4. Define Graph API Client Class

Create a simple client class to handle Microsoft Graph API operations for SharePoint.

In [3]:
class GraphSharePointClient:
    """Simple client for SharePoint operations via Microsoft Graph API"""
    
    def __init__(self, tenant_id, client_id, client_secret):
        self.tenant_id = tenant_id
        self.client_id = client_id
        self.client_secret = client_secret
        self.token = None
        
    def get_token(self):
        """Get access token from Azure AD"""
        token_url = f"https://login.microsoftonline.com/{self.tenant_id}/oauth2/v2.0/token"
        
        token_data = {
            'grant_type': 'client_credentials',
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'scope': 'https://graph.microsoft.com/.default'
        }
        
        response = requests.post(token_url, data=token_data)
        response.raise_for_status()
        self.token = response.json()['access_token']
        return self.token
    
    def get_headers(self):
        """Get authorization headers for requests"""
        if not self.token:
            self.get_token()
        return {
            'Authorization': f'Bearer {self.token}',
            'Content-Type': 'application/json'
        }
    
    def get_site_id(self, site_url):
        """
        Get SharePoint site ID from URL
        Example: https://yusufmartin.sharepoint.com/sites/site0
        """
        # Extract hostname and site path
        # Format: hostname:/sites/sitename
        parts = site_url.replace('https://', '').split('/')
        hostname = parts[0]
        site_path = '/'.join(parts[1:])  # e.g., 'sites/site0'
        
        graph_url = f"https://graph.microsoft.com/v1.0/sites/{hostname}:/{site_path}"
        
        response = requests.get(graph_url, headers=self.get_headers())
        response.raise_for_status()
        
        site_data = response.json()
        print(f"✅ Found site: {site_data['displayName']}")
        print(f"   Site ID: {site_data['id']}")
        return site_data['id']
    
    def upload_file(self, site_id, folder_path, filename, file_content):
        """
        Upload file to SharePoint
        
        Args:
            site_id: SharePoint site ID
            folder_path: Folder path (e.g., 'Shared Documents' or 'Documents')
            filename: Name for the file
            file_content: File content as bytes
        """
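        # Graph API endpoint for uploading files
        # Format: /sites/{site-id}/drive/root:/{folder}/{filename}:/content
        # Percent-encode the path so spaces and special characters are safe,
        # and skip an empty folder so the library root also works
        item_path = quote('/'.join(p for p in (folder_path.strip('/'), filename) if p))
        upload_url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root:/{item_path}:/content"
        
        if not self.token:
            self.get_token()  # Make sure a token exists before building headers
        headers = {
            'Authorization': f'Bearer {self.token}',
            'Content-Type': 'application/octet-stream'
        }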
        
        response = requests.put(upload_url, headers=headers, data=file_content)
        response.raise_for_status()
        
        file_data = response.json()
        return file_data
    
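    def list_files(self, site_id, folder_path):
        """List files in a SharePoint folder (an empty path lists the library root)"""
        # The root:/{path}: form needs a non-empty path, so the root gets its own endpoint
        drive_url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive"
        folder = quote(folder_path.strip('/'))
        list_url = f"{drive_url}/root:/{folder}:/children" if folder else f"{drive_url}/root/children"
        
        response = requests.get(list_url, headers=self.get_headers())
        response.raise_for_status()
        
        return response.json()['value']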
    
    def download_file(self, site_id, folder_path, filename):
        """Download file from SharePoint"""
        item_path = quote('/'.join(p for p in (folder_path.strip('/'), filename) if p))
        download_url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root:/{item_path}:/content"
        
        response = requests.get(download_url, headers=self.get_headers())
        response.raise_for_status()
        
        return response.content
    
    def delete_file(self, site_id, folder_path, filename):
        """Delete file from SharePoint"""
        item_path = quote('/'.join(p for p in (folder_path.strip('/'), filename) if p))
        delete_url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root:/{item_path}"
        
        response = requests.delete(delete_url, headers=self.get_headers())
        response.raise_for_status()
        
        return True

print("✅ GraphSharePointClient class defined")

✅ GraphSharePointClient class defined
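
The client above fetches a token once and keeps using it, which is fine for a short notebook session but not for a long-running job, since access tokens expire. The token endpoint's JSON response also carries an `expires_in` field (seconds), so one possible approach is to cache the token together with its expiry. This is only a sketch: `CachedToken` and its `fetch` callable are hypothetical names, and `fetch` is assumed to return the parsed token response.

```python
import time


class CachedToken:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch, skew_seconds=60, clock=time.monotonic):
        self._fetch = fetch        # callable returning {'access_token': ..., 'expires_in': ...}
        self._skew = skew_seconds  # refresh this many seconds before the real expiry
        self._clock = clock        # injectable so the behaviour can be tested without waiting
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, calling fetch() only when the cached one is stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch()
            self._token = payload["access_token"]
            self._expires_at = now + float(payload.get("expires_in", 0)) - self._skew
        return self._token
```

With the notebook's client, `fetch` could wrap the same `requests.post(token_url, data=token_data)` call and return `response.json()`.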


## 5. Initialize Client and Get Site ID

Create a client instance and retrieve the SharePoint site ID.

In [4]:
# Initialize the Graph API client
client = GraphSharePointClient(TENANT_ID, CLIENT_ID, CLIENT_SECRET)

# Get access token
try:
    token = client.get_token()
    print("✅ Successfully obtained access token!")
    print(f"   Token length: {len(token)} characters")
except Exception as e:
    print(f"❌ Failed to get token: {e}")
    print("\n🔧 Troubleshooting:")
    print("   1. Verify Client ID, Tenant ID, and Secret are correct in .env")
    print("   2. Check that the secret VALUE (not ID) is being used")

# Get the SharePoint site ID
try:
    site_id = client.get_site_id(SHAREPOINT_SITE_URL)
    print("\n✅ Successfully connected to SharePoint site")
except Exception as e:
    print(f"\n❌ Failed to get site: {e}")
    print("\n🔧 Troubleshooting:")
    print("   1. Verify your SHAREPOINT_SITE_URL is correct")
    print("   2. Make sure you have Sites.Read.All or Sites.ReadWrite.All permission")
    print("   3. Ensure admin consent was granted")
    print("   4. Wait 5-10 minutes after granting consent")

✅ Successfully obtained access token!
   Token length: 1906 characters
✅ Found site: site0
   Site ID: yusufmartin.sharepoint.com,ffb88eb7-d591-4d19-8e12-7170dd7956fe,7c24ce8d-c533-44f3-a6cb-e65c9730781b

✅ Successfully connected to SharePoint site

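The site lookup works by turning the browser URL into a Graph address of the form `/sites/{hostname}:/{path}`. The sketch below shows that mapping in isolation using `urllib.parse.urlparse`; `site_lookup_url` is a hypothetical helper, and the fallback to `/sites/{hostname}` for a URL without a path assumes you want the tenant's root site.

```python
from urllib.parse import urlparse

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_lookup_url(site_url):
    """Build the Graph URL that resolves a SharePoint site URL to its site resource."""
    parsed = urlparse(site_url)
    if not parsed.netloc:
        raise ValueError(f"Not an absolute URL: {site_url!r}")
    path = parsed.path.strip("/")  # e.g. 'sites/yoursite'
    if path:
        return f"{GRAPH_BASE}/sites/{parsed.netloc}:/{path}"
    return f"{GRAPH_BASE}/sites/{parsed.netloc}"  # no path: address the root site
```

The `id` returned by that request is the comma-separated value printed above, which combines the hostname with two GUIDs.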

## 6. Create Sample Data

Let's create sample data files to upload to SharePoint.

In [5]:
# Create sample employee data
data = {
    'employee_id': [1, 2, 3, 4, 5],
    'name': ['Alice Johnson', 'Bob Smith', 'Charlie Brown', 'Diana Prince', 'Eve Davis'],
    'department': ['Engineering', 'Sales', 'Engineering', 'HR', 'Sales'],
    'salary': [95000, 65000, 88000, 72000, 70000]
}

df = pd.DataFrame(data)

print("Sample data created:")
print(df)

# Save to local Parquet file
local_filename = 'employees_sample.parquet'
df.to_parquet(local_filename, engine='pyarrow', compression='snappy', index=False)
print(f"\n✅ Saved to local file: {local_filename}")

Sample data created:
   employee_id           name   department  salary
0            1  Alice Johnson  Engineering   95000
1            2      Bob Smith        Sales   65000
2            3  Charlie Brown  Engineering   88000
3            4   Diana Prince           HR   72000
4            5      Eve Davis        Sales   70000

✅ Saved to local file: employees_sample.parquet


## 7. Upload File to SharePoint

Upload a file from your local filesystem to SharePoint.

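**Common folder paths:**
The upload endpoint addresses `/drive/root`, which is already the root of the site's default document library. `FOLDER_PATH` is resolved relative to that root:
- `Shared Documents` - A folder named "Shared Documents" inside the default library, which is why the Web URL in the output below contains `Shared%20Documents` twice
- `Documents` - A folder named "Documents" inside the default library
- `Shared Documents/SubFolder` - A subfolder of the first folder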
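
Path-based addressing in Graph places the folder path and filename between `root:/` and a trailing `:` segment. The sketch below builds such a URL in isolation so the mapping from `FOLDER_PATH` to the request can be inspected before anything is sent; `drive_item_url` is a hypothetical helper, and percent-encoding with `urllib.parse.quote` is a precaution added here for names with spaces or special characters.

```python
from urllib.parse import quote


def drive_item_url(site_id, folder_path, filename, suffix=":/content"):
    """Return the path-based Graph URL for a file in the site's default library."""
    # Drop empty parts so an empty folder_path addresses the library root
    parts = [p for p in (folder_path.strip("/"), filename) if p]
    item_path = quote("/".join(parts))  # '/' is kept, spaces become %20
    return f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root:/{item_path}{suffix}"
```

Printing the result for a few candidate folder values is a quick way to see which library location each one would target.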

In [7]:
# Specify the SharePoint folder path
# Try these options if one doesn't work:
# - 'Shared Documents'
# - 'Documents' 
# - '' (empty string for root)
FOLDER_PATH = 'Shared Documents'

try:
    # Read the local file
    with open(local_filename, 'rb') as f:
        file_content = f.read()
    
    # Upload to SharePoint
    result = client.upload_file(
        site_id=site_id,
        folder_path=FOLDER_PATH,
        filename=local_filename,
        file_content=file_content
    )
    
    print(f"✅ File uploaded successfully!")
    print(f"   Name: {result['name']}")
    print(f"   Size: {result['size'] / 1024:.2f} KB")
    print(f"   Web URL: {result.get('webUrl', 'N/A')}")
    
except requests.exceptions.HTTPError as e:
    print(f"❌ Upload failed: {e}")
    print(f"   Status code: {e.response.status_code}")
    print(f"   Response: {e.response.text}")
    
    if e.response.status_code == 404:
        print("\n🔧 Folder not found. Try these alternatives:")
        print("   - FOLDER_PATH = 'Documents'")
        print("   - FOLDER_PATH = ''  # Root folder")
    elif e.response.status_code == 403:
        print("\n🔧 Permission denied. Ensure you have:")
        print("   - Microsoft Graph API permission: Sites.ReadWrite.All")
        print("   - Admin consent granted")
except Exception as e:
    print(f"❌ Upload failed: {e}")

✅ File uploaded successfully!
   Name: employees_sample.parquet
   Size: 2.92 KB
   Web URL: https://yusufmartin.sharepoint.com/sites/site0/Shared%20Documents/Shared%20Documents/employees_sample.parquet


## 8. Upload File from Memory

Upload data directly from memory without saving to disk first.

In [8]:
# Create new data (could be from database, API, etc.)
new_data = pd.DataFrame({
    'employee_id': [6, 7, 8],
    'name': ['Frank Miller', 'Grace Lee', 'Henry Wilson'],
    'department': ['Engineering', 'HR', 'Sales'],
    'salary': [105000, 68000, 75000]
})

print("New data to upload:")
print(new_data)

# Create in-memory buffer
buffer = io.BytesIO()
new_data.to_parquet(buffer, engine='pyarrow', compression='snappy', index=False)
file_bytes = buffer.getvalue()

print(f"\nGenerated {len(file_bytes)} bytes in memory")

# Upload directly from memory
try:
    filename = 'employees_new.parquet'
    result = client.upload_file(
        site_id=site_id,
        folder_path=FOLDER_PATH,
        filename=filename,
        file_content=file_bytes
    )
    
    print(f"\n✅ File uploaded from memory!")
    print(f"   Name: {result['name']}")
    print(f"   Size: {result['size'] / 1024:.2f} KB")
    
except Exception as e:
    print(f"❌ Upload failed: {e}")

New data to upload:
   employee_id          name   department  salary
0            6  Frank Miller  Engineering  105000
1            7     Grace Lee           HR   68000
2            8  Henry Wilson        Sales   75000

Generated 2938 bytes in memory

✅ File uploaded from memory!
   Name: employees_new.parquet
   Size: 2.87 KB


## 9. Upload Different File Types

The same approach works for CSV, Excel, JSON, images, etc.

In [9]:
# Example 1: Upload CSV file
csv_buffer = io.StringIO()
df.to_csv(csv_buffer, index=False)
csv_bytes = csv_buffer.getvalue().encode('utf-8')

# Example 2: Upload Excel file
excel_buffer = io.BytesIO()
df.to_excel(excel_buffer, index=False, engine='openpyxl')
excel_bytes = excel_buffer.getvalue()

# Example 3: Upload JSON file
json_str = df.to_json(orient='records', indent=2)
json_bytes = json_str.encode('utf-8')

# Upload all three files
try:
    # Upload CSV
    client.upload_file(site_id, FOLDER_PATH, 'employees.csv', csv_bytes)
    print("✅ CSV uploaded")
    
    # Upload Excel
    client.upload_file(site_id, FOLDER_PATH, 'employees.xlsx', excel_bytes)
    print("✅ Excel uploaded")
    
    # Upload JSON
    client.upload_file(site_id, FOLDER_PATH, 'employees.json', json_bytes)
    print("✅ JSON uploaded")
    
    print(f"\n✅ All files uploaded to SharePoint!")
    
except Exception as e:
    print(f"❌ Upload failed: {e}")

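**Note:** The run captured here stopped with `ModuleNotFoundError: No module named 'openpyxl'`. The `df.to_excel(..., engine='openpyxl')` call sits outside the `try` block, so a missing `openpyxl` package aborts the cell before any upload starts. Install it with the command from the Installation section and re-run the cell.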
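
One way to catch a missing optional package before the cell runs is to check for it with the standard library's `importlib.util.find_spec`. This is a small sketch; `missing_modules` is a hypothetical helper and it only handles top-level package names.

```python
import importlib.util


def missing_modules(*names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The packages this notebook relies on for building sample files
missing = missing_modules("pandas", "pyarrow", "openpyxl")
if missing:
    print(f"Install before continuing: pip install {' '.join(missing)}")
```

Running this check at the top of the notebook reports every missing package at once instead of failing at the first import that needs one.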

## 10. List Files in SharePoint Folder

Verify your uploads by listing all files in a SharePoint folder.

In [10]:
try:
    files = client.list_files(site_id, FOLDER_PATH)
    
    print(f"📁 Files in {FOLDER_PATH}:\n")
    print(f"{'Filename':<40} {'Size (KB)':<12} {'Modified':<20} {'Type'}")
    print("-" * 90)
    
    for item in files:
        if 'file' in item:  # Only show files, not folders
            name = item['name']
            size = item['size'] / 1024
            modified = item['lastModifiedDateTime'][:10]  # Just the date
            file_type = item['name'].split('.')[-1] if '.' in item['name'] else 'N/A'
            print(f"{name:<40} {size:>10.2f} KB {modified:<20} {file_type}")
    
    file_count = sum(1 for f in files if 'file' in f)
    folder_count = sum(1 for f in files if 'folder' in f)
    print(f"\nTotal: {file_count} files, {folder_count} folders")
    
except Exception as e:
    print(f"❌ Failed to list files: {e}")

📁 Files in Shared Documents:

Filename                                 Size (KB)    Modified             Type
------------------------------------------------------------------------------------------
employees_new.parquet                          2.87 KB 2026-01-21           parquet
employees_sample.parquet                       2.92 KB 2026-01-21           parquet

Total: 2 files, 0 folders
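
`list_files` returns only the `value` array of the first response. For folders with many items, Graph splits the listing into pages and includes an `@odata.nextLink` URL whenever more results remain. A possible way to walk all pages is sketched below; `iter_children` and its `fetch_page` callable are hypothetical names, and `fetch_page` is assumed to take a URL and return the parsed JSON body.

```python
def iter_children(first_url, fetch_page):
    """Yield every item of a paged Graph listing, following @odata.nextLink."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")  # absent on the last page, which ends the loop
```

With the notebook's client, `fetch_page` could be `lambda url: requests.get(url, headers=client.get_headers()).json()`, ideally with a `raise_for_status()` check added.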


## 11. Download File from SharePoint

Download a file from SharePoint back to your local machine or into memory.

In [None]:
# Specify the file to download
file_to_download = 'employees_sample.parquet'

try:
    # Download file content
    file_content = client.download_file(site_id, FOLDER_PATH, file_to_download)
    
    # Save to local file
    download_path = f'downloaded_{file_to_download}'
    with open(download_path, 'wb') as local_file:
        local_file.write(file_content)
    
    print(f"✅ File downloaded: {download_path}")
    print(f"   Size: {len(file_content) / 1024:.2f} KB")
    
    # Or read directly into pandas
    df_downloaded = pd.read_parquet(io.BytesIO(file_content))
    print(f"\nDownloaded data (first 3 rows):")
    print(df_downloaded.head(3))
    
except Exception as e:
    print(f"❌ Download failed: {e}")
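
`download_file` holds the whole file in memory, which suits small files like the samples here. For larger files, one option is to stream the response and write it in chunks. The sketch below shows only the writing side so it can run without a network connection; `write_chunks` is a hypothetical helper, and in real use the chunks would come from something like `requests.get(url, headers=..., stream=True).iter_content(chunk_size=1024 * 1024)`.

```python
import os
import tempfile


def write_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the number of bytes written."""
    total = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                total += len(chunk)
    return total


# Demo with in-memory chunks standing in for response.iter_content(...)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
written = write_chunks([b"abc", b"", b"defg"], demo_path)
```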

## 12. Delete File from SharePoint

Remove files from SharePoint (use with caution!).

In [None]:
# Specify file to delete
file_to_delete = 'employees_new.parquet'

# Safety flag - set to True to enable deletion
delete_enabled = False

if delete_enabled:
    try:
        client.delete_file(site_id, FOLDER_PATH, file_to_delete)
        print(f"✅ File deleted: {file_to_delete}")
        
    except Exception as e:
        print(f"❌ Deletion failed: {e}")
else:
    print("⚠️  File deletion is disabled. Set delete_enabled=True to enable.")
    print(f"   File to delete: {file_to_delete}")

## Summary

This notebook demonstrated:

✅ **Microsoft Graph API Authentication** - Using Azure AD app credentials

✅ **Upload files** - From disk or directly from memory

✅ **Multiple file formats** - Parquet, CSV, Excel, JSON

✅ **List files** - View contents of SharePoint folders

✅ **Download files** - Retrieve files from SharePoint

✅ **Delete files** - Remove files from SharePoint

### Key Advantages of Graph API:

- **App-only Authentication** - Uses the OAuth 2.0 client credentials flow, so scripts run without a user sign-in or MFA prompt
- **Reliable** - More dependable than the SharePoint REST API for app-only authentication
- **Simple HTTP Requests** - No complex SDK dependencies
- **Cross-platform** - Works on Windows, Mac, Linux
- **Well-documented** - Extensive Microsoft documentation

### Common Folder Paths:

Folder paths are relative to the root of the site's default document library. If one doesn't work, try these alternatives:
- `Shared Documents`
- `Documents`
- `Shared Documents/SubFolder`
- `''` (empty string for root)

### Troubleshooting:

**404 Error (Not Found):**
- Check folder path spelling
- Try different folder path variations
- Verify site URL is correct

**403 Error (Forbidden):**
- Ensure `Sites.ReadWrite.All` permission is added
- Verify admin consent was granted
- Wait 5-10 minutes after granting consent

**401 Error (Unauthorized):**
- Check client secret is the VALUE (not ID)
- Verify tenant ID and client ID are correct
- Ensure permission type is "Application" (not Delegated)
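
When a 401 or 403 persists, it can help to look at what the token actually contains. Access tokens issued for app-only calls are typically JWTs, and granted application permissions usually appear in a `roles` claim. The sketch below decodes the payload for debugging only: `token_claims` is a hypothetical helper, it does not verify the signature, and the token format is an assumption that Microsoft may change.

```python
import base64
import json


def token_claims(token):
    """Decode the payload segment of a JWT without verifying its signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the base64 padding that JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, `token_claims(client.get_token()).get("roles")` would be expected to include `Sites.ReadWrite.All` once admin consent has taken effect. Avoid printing or logging the token itself.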

### Next Steps:

- Integrate with data pipelines (ETL workflows)
- Schedule uploads using cron jobs or Azure Functions
- Add error notifications (email, Slack, Teams)
- Implement retry logic for failed uploads
- Use with version control for data versioning
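
As a starting point for the retry item above, the sketch below retries a request on throttling and transient server errors, honouring a numeric `Retry-After` header when the response provides one and falling back to exponential backoff otherwise. `call_with_retry`, the status list, and the delay values are assumptions chosen for illustration, not recommended settings; `send` is any zero-argument callable returning a `requests`-style response.

```python
import time

# Status codes treated as transient in this sketch
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE or attempt == max_attempts:
            return response
        retry_after = response.headers.get("Retry-After")
        if retry_after and retry_after.isdigit():
            delay = float(retry_after)                 # server-provided wait in seconds
        else:
            delay = base_delay * 2 ** (attempt - 1)    # 1s, 2s, 4s, ...
        sleep(delay)
```

An upload could then be wrapped as `call_with_retry(lambda: requests.put(upload_url, headers=headers, data=file_content))`, followed by `raise_for_status()` on the returned response.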