This repository provides a Python script (`main.py`) that fetches product data from the Tiki API and saves it into a PostgreSQL database. It is designed for stability, resumability, and scalability, and can process hundreds of thousands of API calls while handling errors gracefully.
- 🔄 Reads product IDs from a CSV input file
- 🚀 Fetches live product data from Tiki API endpoints
- 🧾 Saves results directly to PostgreSQL
- ⚠️ Logs all failed requests and exceptions
- ♻️ Supports resuming partially completed runs (sketched below)
- ⚙️ Compatible with Supervisord for continuous background operation
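
Resuming a partially completed run can be as simple as skipping IDs that are already stored. The sketch below illustrates the idea; the `products` table name is an assumption, and the actual resume logic in `main.py` may differ:

```python
import csv

def pending_ids(conn, csv_path='product_id.csv'):
    """Return IDs from the CSV that are not yet in the database."""
    with conn.cursor() as cur:
        cur.execute('SELECT id FROM products')   # assumed table name
        done = {str(row[0]) for row in cur.fetchall()}
    with open(csv_path, newline='') as f:
        return [row[0] for row in csv.reader(f) if row and row[0] not in done]
```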
| File | Description |
|---|---|
| `main.py` 🐍 | Main Python script to fetch API data |
| `product_id.csv` 📄 | Input list of product IDs |
| `database.ini` 🗄️ | PostgreSQL configuration file |
| `requirements.txt` 📦 | Python dependencies |
| `supervisord.conf` 🛠️ | Supervisor configuration for background execution |
Before running the script, you need to prepare your environment and database:
- Install PostgreSQL: make sure you have a PostgreSQL server running and can create a database for this project.
- Create a database. Example:

  ```sql
  CREATE DATABASE tiki_api_data;
  CREATE USER your_user WITH PASSWORD 'your_password';
  GRANT ALL PRIVILEGES ON DATABASE tiki_api_data TO your_user;
  ```

- Create a database configuration file (`database.ini`):

  ```ini
  [postgresql_tiki]
  host=localhost
  port=5432
  database=tiki_api_data
  user=your_user
  password=your_password
  ```

- Optional helper script (`connect.py`):
  - You don't strictly need a separate `connect.py` if your main script reads `database.ini` and opens a connection internally.
  - If you prefer modularity, you can create a `connect.py` file:
```python
import psycopg2
from configparser import ConfigParser

def connect(section='postgresql_tiki'):
    """Open a PostgreSQL connection using settings from database.ini."""
    parser = ConfigParser()
    parser.read('database.ini')
    db = dict(parser[section])   # host, port, database, user, password
    return psycopg2.connect(**db)
```

`main.py` can then import `connect()` from this file.
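
A minimal usage example, assuming the helper above sits next to `main.py`:

```python
from connect import connect

conn = connect()                  # reads database.ini
with conn, conn.cursor() as cur:  # commits on success, rolls back on error
    cur.execute('SELECT version()')
    print(cur.fetchone())
conn.close()
```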
Clone the repository:

```bash
git clone https://github.com/ndlryan/API-Data-with-Postgres.git
cd API-Data-with-Postgres
```

Install dependencies:

```bash
pip install -r requirements.txt
```

Run directly from the terminal:

```bash
python main.py
```

This will:
- Load product IDs from product_id.csv
- Fetch product details from Tiki API
- Save results into PostgreSQL
- Record any failed requests or exceptions in logs
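
In outline, that loop looks something like the sketch below. The endpoint URL, table name, and column choices are assumptions for illustration; the actual logic lives in `main.py`:

```python
import csv
import logging
import requests

API_URL = 'https://tiki.vn/api/v2/products/{}'   # assumed endpoint format

logging.basicConfig(filename='failed_requests.log', level=logging.WARNING)

def run(conn):
    with open('product_id.csv', newline='') as f:
        ids = [row[0] for row in csv.reader(f) if row]
    for pid in ids:
        try:
            resp = requests.get(API_URL.format(pid), timeout=10)
            resp.raise_for_status()                # raises on 404 etc.
            item = resp.json()
            with conn, conn.cursor() as cur:       # commit per product
                cur.execute(
                    'INSERT INTO products (id, name, price) VALUES (%s, %s, %s)',
                    (item.get('id'), item.get('name'), item.get('price')),
                )
        except Exception as exc:                   # log failure, keep going
            logging.warning('product %s failed: %s', pid, exc)
```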
For long-running or auto-restarting crawls, you can manage the crawler with Supervisord.
Install Supervisor:

```bash
pip install supervisor
```

Example `supervisord.conf`:

```ini
[unix_http_server]
file=/tmp/supervisor.sock

[supervisord]
logfile=supervisord.log
pidfile=/tmp/supervisord.pid
childlogdir=./logs

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock

[program:api_data]
command=python3 /path/to/API-Data-with-Postgres/main.py
directory=/path/to/API-Data-with-Postgres
autostart=true
autorestart=true
stderr_logfile=./logs/api_data.err.log
stdout_logfile=./logs/api_data.out.log
```

- 🔧 Replace `/path/to/API-Data-with-Postgres` with your actual project path.
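
Note that Supervisor will not create the `childlogdir` directory for you, so create `./logs` (e.g. `mkdir -p logs`) in the project directory before starting.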
Start Supervisor and check its status:

```bash
supervisord -c supervisord.conf
supervisorctl -c supervisord.conf status
```

Restart or stop the crawler anytime:

```bash
supervisorctl -c supervisord.conf restart api_data
supervisorctl -c supervisord.conf stop api_data
```

Outputs:

- Database: PostgreSQL (records inserted from the API)
- Error logs: exceptions and 404 errors
- Supervisor logs: stored under `./logs/` when using `supervisord.conf`
Sample run statistics:

- Est. runtime: ~1 hour
- Total processed: 200,000
- Good records (including those with missing fields): 198,942
- Exceptions (404 Not Found): 1,058

Notes:

- Ensure `database.ini` has the correct credentials.
- Running multiple times does not duplicate records.
- Use Supervisor to prevent downtime or data loss.
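
One common way to get that no-duplicates behavior is an upsert keyed on the product ID. Whether `main.py` does exactly this, and the schema below, are assumptions, but the pattern looks like:

```python
DDL = '''
CREATE TABLE IF NOT EXISTS products (
    id    BIGINT PRIMARY KEY,   -- assumed schema, for illustration
    name  TEXT,
    price NUMERIC
)'''

# ON CONFLICT DO NOTHING makes re-runs safe: an ID that is already
# stored is silently skipped instead of raising a duplicate-key error.
INSERT = '''
INSERT INTO products (id, name, price)
VALUES (%s, %s, %s)
ON CONFLICT (id) DO NOTHING'''

def save(conn, item):
    with conn, conn.cursor() as cur:
        cur.execute(DDL)
        cur.execute(INSERT, (item['id'], item['name'], item['price']))
```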
Author: Ryan ([GitHub Profile](https://github.com/ndlryan))
A robust, fault-tolerant Tiki API data loader — lightweight, automated, and production-ready.