API Data Integration

A Python framework for integrating with REST APIs to extract, transform, and load data. Includes support for authentication, rate limiting, error handling, and data validation.

🎯 Features

  • REST API Integration: Easy-to-use framework for API data extraction
  • Authentication Support: OAuth2, API keys, Basic Auth
  • Rate Limiting: Built-in rate limiting and retry logic
  • Error Handling: Comprehensive error handling and logging
  • Data Validation: Validate API responses before processing
  • Incremental Loading: Support for incremental data extraction
  • Multiple Formats: Support for JSON, XML, and CSV responses
  • Pagination Support: Automatic handling of paginated APIs
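
To illustrate the data validation feature, here is a minimal sketch of checking records before they are processed. The names `REQUIRED_FIELDS`, `validate_record`, and `split_valid` are hypothetical and chosen for this example; they are not taken from the framework's source.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema for illustration: field name -> expected Python type.
REQUIRED_FIELDS: Dict[str, type] = {"id": int, "name": str}


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def split_valid(
    records: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate records into (valid, invalid) lists."""
    valid, invalid = [], []
    for record in records:
        (invalid if validate_record(record) else valid).append(record)
    return valid, invalid
```

Keeping invalid records in a separate list, rather than raising on the first failure, lets a pipeline load the good rows and report the rest.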

📋 Prerequisites

  • Python 3.8+
  • pip

🛠️ Installation

1. Clone the repository

git clone https://github.com/deepapanicker/api-data-integration.git
cd api-data-integration

2. Create virtual environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install dependencies

pip install -r requirements.txt

📁 Project Structure

api-data-integration/
├── src/
│   ├── clients/           # API client implementations
│   │   ├── base_client.py
│   │   ├── rest_client.py
│   │   └── oauth_client.py
│   ├── extractors/        # Data extraction modules
│   │   ├── api_extractor.py
│   │   └── incremental_extractor.py
│   ├── transformers/      # Data transformation
│   │   └── response_transformer.py
│   ├── loaders/           # Data loading
│   │   └── database_loader.py
│   └── utils/             # Utilities
│       ├── rate_limiter.py
│       └── error_handler.py
├── config/                # Configuration files
│   └── api_config.yaml.example
├── examples/              # Example scripts
│   ├── basic_extraction.py
│   ├── incremental_sync.py
│   └── oauth_example.py
├── tests/                 # Unit tests
├── ARCHITECTURE.md        # Architecture diagram and flow
├── requirements.txt
└── README.md

🚀 Quick Start

Basic API Extraction

from src.clients.rest_client import RESTClient
from src.extractors.api_extractor import APIExtractor

# Initialize client
client = RESTClient(
    base_url='https://api.example.com',
    api_key='your-api-key'
)

# Extract data
extractor = APIExtractor(client)
data = extractor.extract(endpoint='/customers', params={'limit': 100})

# Process data
for record in data:
    print(record)
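
For APIs that return results in pages, an extraction loop could be sketched as below. This assumes offset/limit pagination and a caller-supplied `fetch_page` function standing in for the HTTP request; both are assumptions for illustration, and real APIs may use cursors or page numbers instead. `iterate_pages` is a hypothetical helper, not part of the framework.

```python
from typing import Any, Callable, Dict, Iterator, List

# fetch_page(offset, limit) is a hypothetical callable that returns one page
# of records; in practice it would wrap an HTTP GET against the API.
FetchPage = Callable[[int, int], List[Dict[str, Any]]]


def iterate_pages(fetch_page: FetchPage, limit: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield records page by page until a short or empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        # A page shorter than `limit` means there is nothing left to fetch.
        if len(page) < limit:
            break
        offset += limit
```

Because the function is a generator, records can be processed as they arrive instead of holding every page in memory.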

Incremental Extraction

from src.extractors.incremental_extractor import IncrementalExtractor

# `client` is the RESTClient created in the basic example above
extractor = IncrementalExtractor(
    client=client,
    endpoint='/orders',
    timestamp_field='updated_at',
    last_sync_file='last_sync.json'
)

# Extract only new/updated records
new_data = extractor.extract_incremental()
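
The idea behind incremental extraction can be sketched with a small state file that stores the newest timestamp seen so far. The helpers below are hypothetical and only illustrate the technique; the `{"last_sync": ...}` file layout is an assumption, and the comparison assumes ISO 8601 timestamps in a single consistent format so that string order matches time order.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional


def load_last_sync(path: Path) -> Optional[str]:
    """Read the stored watermark, or return None on the first run."""
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("last_sync")


def filter_new(
    records: List[Dict[str, Any]], timestamp_field: str, last_sync: Optional[str]
) -> List[Dict[str, Any]]:
    """Keep only records changed after the watermark (all records if none)."""
    if last_sync is None:
        return list(records)
    return [r for r in records if r[timestamp_field] > last_sync]


def save_last_sync(
    path: Path, records: List[Dict[str, Any]], timestamp_field: str
) -> None:
    """Persist the newest timestamp so the next run can resume from it."""
    if records:
        newest = max(r[timestamp_field] for r in records)
        path.write_text(json.dumps({"last_sync": newest}))
```

Writing the watermark only after records have been processed successfully avoids skipping data when a run fails midway.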

OAuth2 Authentication

from src.clients.oauth_client import OAuthClient

client = OAuthClient(
    base_url='https://api.example.com',
    client_id='your-client-id',
    client_secret='your-client-secret',
    token_url='https://api.example.com/oauth/token'
)

data = client.get('/protected-endpoint')
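
An OAuth2 client typically caches the access token and requests a new one shortly before it expires. A minimal sketch of that caching logic follows. `TokenCache` and `fetch_token` are hypothetical names, and `fetch_token` stands in for the POST to `token_url`; the `access_token` and `expires_in` keys are the standard OAuth2 token response fields. How `OAuthClient` handles this internally is not shown in this README.

```python
import time
from typing import Any, Callable, Dict, Optional


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Dict[str, Any]],  # hypothetical token request
        clock: Callable[[], float] = time.monotonic,
        skew_seconds: float = 30.0,
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._skew = skew_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one when needed."""
        expired = self._clock() >= self._expires_at - self._skew
        if self._token is None or expired:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            self._expires_at = self._clock() + float(payload.get("expires_in", 0))
        return self._token
```

The `skew_seconds` margin refreshes the token slightly early so that a request does not go out with a token that expires in flight.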

🔧 Configuration

API Configuration

Copy config/api_config.yaml.example to config/api_config.yaml, then edit it:

apis:
  example_api:
    base_url: https://api.example.com
    authentication:
      type: api_key
      header: X-API-Key
      value: ${API_KEY}
    rate_limit:
      requests_per_second: 10
    retry:
      max_retries: 3
      backoff_factor: 2
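
The `${API_KEY}` placeholder suggests that values are filled in from environment variables. One way such substitution could work on an already-parsed configuration is sketched below. `expand_env` is a hypothetical helper, and whether the framework resolves placeholders this way is an assumption.

```python
import os
import re
from typing import Any, Mapping

# Matches ${NAME} where NAME is a typical environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Recursively replace ${VAR} placeholders in a parsed config structure."""
    if isinstance(value, str):
        def _substitute(match: "re.Match[str]") -> str:
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_substitute, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged
```

Raising on a missing variable, instead of substituting an empty string, surfaces a misconfigured environment before any request is sent.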

📊 Usage Examples

Complete ETL Pipeline

from src.clients.rest_client import RESTClient
from src.extractors.api_extractor import APIExtractor
from src.transformers.response_transformer import ResponseTransformer
from src.loaders.database_loader import DatabaseLoader

# 1. Extract
client = RESTClient(base_url='https://api.example.com', api_key='key')
extractor = APIExtractor(client)
raw_data = extractor.extract('/customers', pagination=True)

# 2. Transform
transformer = ResponseTransformer()
transformed_data = transformer.transform(
    raw_data,
    field_mapping={'customer_id': 'id', 'customer_name': 'name'}
)

# 3. Load
loader = DatabaseLoader(connection_string='postgresql://...')
result = loader.load(
    data=transformed_data,
    table_name='customers',
    load_mode='upsert',
    unique_key='id'
)
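
The `field_mapping` argument in the transform step renames fields. A minimal sketch of that renaming is shown here, assuming the mapping reads as source name to target name (which fits the loader's `unique_key='id'`). `apply_field_mapping` is a hypothetical function, not the transformer's actual implementation.

```python
from typing import Any, Dict, List


def apply_field_mapping(
    records: List[Dict[str, Any]], field_mapping: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Rename keys per field_mapping (source -> target); keep other keys as-is."""
    return [
        {field_mapping.get(key, key): value for key, value in record.items()}
        for record in records
    ]
```

For example, `apply_field_mapping([{'customer_id': 7, 'customer_name': 'Ada'}], {'customer_id': 'id', 'customer_name': 'name'})` produces records keyed by `id` and `name`.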

Rate Limiting

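from src.utils.rate_limiter import RateLimiter

rate_limiter = RateLimiter(requests_per_second=10)

# `extractor` is the APIExtractor created in the Quick Start
endpoints = ['/customers', '/orders']

for endpoint in endpoints:
    rate_limiter.wait_if_needed()  # Block until the next request is allowed
    data = extractor.extract(endpoint)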
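
To show what a call like `wait_if_needed()` might do, here is a minimal sketch of a limiter that enforces a minimum interval between requests. `SimpleRateLimiter` is a hypothetical class written for this example; the framework's `RateLimiter` may use a different algorithm, such as a token bucket.

```python
import time
from typing import Callable


class SimpleRateLimiter:
    """Spaces calls at least 1 / requests_per_second seconds apart."""

    def __init__(
        self,
        requests_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self._min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = float("-inf")  # the first call never waits

    def wait_if_needed(self) -> None:
        """Sleep just long enough to honor the configured rate."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
```

The clock and sleep functions are injectable so the limiter can be tested without real delays.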

Error Handling

from src.utils.error_handler import ErrorHandler

error_handler = ErrorHandler(log_file='errors.log')

try:
    data = extractor.extract('/endpoint')
except Exception as e:
    error_info = error_handler.handle_error(e, context={'endpoint': '/endpoint'})
    # Error logged and tracked
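
Transient failures are often handled by retrying with increasing delays, in line with the `max_retries` and `backoff_factor` settings in the configuration. The sketch below is one possible interpretation: the delay formula `backoff_factor ** attempt` is an assumption, and `call_with_retry` is a hypothetical helper, not the framework's retry logic.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    max_retries: int = 3,
    backoff_factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); on failure, retry up to max_retries times with growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error to the caller
            # Assumed schedule: backoff_factor ** attempt seconds between tries.
            sleep(backoff_factor ** attempt)
    raise RuntimeError("unreachable")  # keeps type checkers satisfied
```

In practice it is usually worth retrying only errors that can succeed on a second attempt, such as timeouts or HTTP 429 and 5xx responses, rather than every exception.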

🏗️ Architecture

  • Architecture Details: See ARCHITECTURE.md for detailed architecture diagrams and data flow
  • Visual Diagrams: See DIAGRAM.md for comprehensive visual diagrams showing system components and data flow

📝 Examples

Run Examples

# Basic extraction
python examples/basic_extraction.py

# Incremental sync
python examples/incremental_sync.py

# OAuth2 example
python examples/oauth_example.py

🧪 Testing

pytest tests/

📚 Documentation

  • Architecture: See ARCHITECTURE.md for system design
  • Examples: See examples/ directory for usage examples
  • API Reference: See docstrings in source files

🔒 Security Best Practices

  • Store API keys in environment variables, not in source code or committed config files
  • Use OAuth2 for production APIs
  • Implement proper error handling
  • Log sensitive operations, but keep API keys and tokens out of log output
  • Use HTTPS for all API communications
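
As a small illustration of the first and fourth points, the sketch below reads a key from the environment and masks it before logging. The variable name `API_KEY` matches the configuration example; `get_api_key` and `mask_secret` are hypothetical helpers.

```python
import os


def get_api_key(var_name: str = "API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is absent."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running")
    return key


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging `mask_secret(key)` instead of the key itself still lets an operator tell which credential was used.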

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

📝 License

MIT License

👤 Author

Deepa Govinda Panicker
