A Python framework for integrating with REST APIs to extract, transform, and load data. Includes support for authentication, rate limiting, error handling, and data validation.
- REST API Integration: Easy-to-use framework for API data extraction
- Authentication Support: OAuth2, API keys, Basic Auth
- Rate Limiting: Built-in rate limiting and retry logic
- Error Handling: Comprehensive error handling and logging
- Data Validation: Validate API responses before processing
- Incremental Loading: Support for incremental data extraction
- Multiple Formats: Support for JSON, XML, and CSV responses
- Pagination Support: Automatic handling of paginated APIs
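
To give a feel for what automatic pagination handling involves, here is a minimal, self-contained sketch. It is not the framework's implementation: `fetch_all_pages`, the `fetch_page` callable, and the paging convention (pages start at 1, an empty page ends the run) are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]


def fetch_all_pages(
    fetch_page: Callable[[int], List[Record]],
    max_pages: int = 1000,
) -> Iterator[Record]:
    """Yield records page by page until an empty page is returned.

    `fetch_page` is a hypothetical callable that takes a 1-based page
    number and returns that page's records as a list.
    """
    for page in range(1, max_pages + 1):
        records = fetch_page(page)
        if not records:  # assumed convention: an empty page means "done"
            return
        yield from records


# Stand-in for a real API call: three records served two per page.
_FAKE_PAGES = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}


def fake_fetch_page(page: int) -> List[Record]:
    return _FAKE_PAGES.get(page, [])


all_records = list(fetch_all_pages(fake_fetch_page))
```

The `max_pages` cap is a safety net so that an API that never returns an empty page cannot loop forever.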
- Python 3.8+
- pip
    git clone https://github.com/deepapanicker/api-data-integration.git
    cd api-data-integration
    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

    api-data-integration/
    ├── src/
    │   ├── clients/              # API client implementations
    │   │   ├── base_client.py
    │   │   ├── rest_client.py
    │   │   └── oauth_client.py
    │   ├── extractors/           # Data extraction modules
    │   │   ├── api_extractor.py
    │   │   └── incremental_extractor.py
    │   ├── transformers/         # Data transformation
    │   │   └── response_transformer.py
    │   ├── loaders/              # Data loading
    │   │   └── database_loader.py
    │   └── utils/                # Utilities
    │       ├── rate_limiter.py
    │       └── error_handler.py
    ├── config/                   # Configuration files
    │   └── api_config.yaml.example
    ├── examples/                 # Example scripts
    │   ├── basic_extraction.py
    │   ├── incremental_sync.py
    │   └── oauth_example.py
    ├── tests/                    # Unit tests
    ├── ARCHITECTURE.md           # Architecture diagram and flow
    ├── requirements.txt
    └── README.md

    from src.clients.rest_client import RESTClient
    from src.extractors.api_extractor import APIExtractor

    # Initialize client
    client = RESTClient(
        base_url='https://api.example.com',
        api_key='your-api-key'
    )

    # Extract data
    extractor = APIExtractor(client)
    data = extractor.extract(endpoint='/customers', params={'limit': 100})

    # Process data
    for record in data:
        print(record)

    from src.extractors.incremental_extractor import IncrementalExtractor
    extractor = IncrementalExtractor(
        client=client,
        endpoint='/orders',
        timestamp_field='updated_at',
        last_sync_file='last_sync.json'
    )

    # Extract only new/updated records
    new_data = extractor.extract_incremental()

    from src.clients.oauth_client import OAuthClient
    client = OAuthClient(
        base_url='https://api.example.com',
        client_id='your-client-id',
        client_secret='your-client-secret',
        token_url='https://api.example.com/oauth/token'
    )

    data = client.get('/protected-endpoint')

Edit config/api_config.yaml:

    apis:
      example_api:
        base_url: https://api.example.com
        authentication:
          type: api_key
          header: X-API-Key
          value: ${API_KEY}
        rate_limit:
          requests_per_second: 10
        retry:
          max_retries: 3
          backoff_factor: 2

    from src.clients.rest_client import RESTClient
    from src.extractors.api_extractor import APIExtractor
    from src.transformers.response_transformer import ResponseTransformer
    from src.loaders.database_loader import DatabaseLoader

    # 1. Extract
    client = RESTClient(base_url='https://api.example.com', api_key='key')
    extractor = APIExtractor(client)
    raw_data = extractor.extract('/customers', pagination=True)

    # 2. Transform
    transformer = ResponseTransformer()
    transformed_data = transformer.transform(
        raw_data,
        field_mapping={'customer_id': 'id', 'customer_name': 'name'}
    )

    # 3. Load
    loader = DatabaseLoader(connection_string='postgresql://...')
    result = loader.load(
        data=transformed_data,
        table_name='customers',
        load_mode='upsert',
        unique_key='id'
    )

    from src.utils.rate_limiter import RateLimiter
    rate_limiter = RateLimiter(requests_per_second=10)
    endpoints = ['/customers', '/orders']  # example endpoints to extract

    for endpoint in endpoints:
        rate_limiter.wait_if_needed()  # Respect rate limits
        data = extractor.extract(endpoint)

    from src.utils.error_handler import ErrorHandler
    error_handler = ErrorHandler(log_file='errors.log')

    try:
        data = extractor.extract('/endpoint')
    except Exception as e:
        error_info = error_handler.handle_error(e, context={'endpoint': '/endpoint'})
        # Error logged and tracked

- Architecture Details: See ARCHITECTURE.md for detailed architecture diagrams and data flow
- Visual Diagrams: See DIAGRAM.md for comprehensive visual diagrams showing system components and data flow
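
For readers curious how a limiter used as `RateLimiter(requests_per_second=10)` with `wait_if_needed()` might work internally, the sketch below shows one simple approach: enforce a minimum interval between calls. This is an illustration only, not the code in `src/utils/rate_limiter.py`. The class name `SimpleRateLimiter` and the injectable `clock` and `sleep` parameters are hypothetical and exist mainly to make the logic easy to test.

```python
import time
from typing import Callable, Optional


class SimpleRateLimiter:
    """Spaces calls at least 1 / requests_per_second seconds apart."""

    def __init__(
        self,
        requests_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self.min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        # Earliest time the next call may proceed; None until the first call.
        self._next_allowed: Optional[float] = None

    def wait_if_needed(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        waited = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            waited = self._next_allowed - now
            self._sleep(waited)
            now = self._clock()  # re-read in case sleep overshot
        self._next_allowed = now + self.min_interval
        return waited


limiter = SimpleRateLimiter(requests_per_second=10)
for endpoint in ["/customers", "/orders"]:
    limiter.wait_if_needed()  # the second call sleeps for up to 0.1 s
```

A fixed minimum interval is the simplest policy. It does not allow short bursts, which a token-bucket design would.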
    # Basic extraction
    python examples/basic_extraction.py

    # Incremental sync
    python examples/incremental_sync.py

    # OAuth2 example
    python examples/oauth_example.py

    # Run the unit tests
    pytest tests/

- Architecture: See ARCHITECTURE.md for system design
- Examples: See the examples/ directory for usage examples
- API Reference: See docstrings in source files
- Store API keys in environment variables
- Use OAuth2 for production APIs
- Implement proper error handling
- Log sensitive operations appropriately
- Use HTTPS for all API communications
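
As a small illustration of the first point, the sketch below resolves a `${API_KEY}`-style placeholder (the form used in the configuration example) from environment variables. How the framework itself expands such placeholders is not documented here, so treat this as an assumption: `resolve_secret` and its `env` parameter are hypothetical names, and the approach relies only on the standard library's `string.Template`.

```python
import os
from string import Template
from typing import Mapping, Optional


def resolve_secret(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Expand ${VAR} placeholders in `value` from environment variables.

    Raises KeyError if a referenced variable is not set, so a missing
    secret fails loudly instead of being sent as a literal "${API_KEY}".
    """
    mapping = os.environ if env is None else env
    return Template(value).substitute(mapping)


# Demo with an explicit mapping instead of the real environment.
api_key = resolve_secret("${API_KEY}", env={"API_KEY": "example-key"})
```

In real use, the key would be exported in the shell (for example `export API_KEY=...`) and `resolve_secret` called without the `env` argument. Note that `string.Template` treats `$` as special, so a literal dollar sign in a value has to be written as `$$`.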
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License
Deepa Govinda Panicker
- GitHub: @deepapanicker
- Portfolio: deepapanicker.com