Tiki API Data to PostgreSQL


Overview

This repository provides a Python script (main.py) that fetches product data from the Tiki API and saves it into a PostgreSQL database. It is designed for stability, resumability, and scalability, and can process hundreds of thousands of API calls while handling errors gracefully.


Features

  • 🔄 Reads product IDs from a CSV input file
  • 🚀 Fetches live product data from Tiki API endpoints
  • 🧾 Saves results directly to PostgreSQL
  • ⚠️ Logs all failed requests and exceptions
  • ♻️ Supports resuming partially completed runs
  • ⚙️ Compatible with Supervisord for continuous background operation
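
The resume feature can be approximated as: query the IDs already stored, then fetch only the rest. A minimal sketch, assuming a `products` table with a `product_id` column (illustrative names, not necessarily the repository's actual schema):

```python
def remaining_ids(all_ids, cur):
    """Return the product IDs not yet present in the database,
    so a restarted run skips work it already finished."""
    cur.execute("SELECT product_id FROM products")
    done = {row[0] for row in cur.fetchall()}
    return [pid for pid in all_ids if pid not in done]
```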

Project Structure

| File | Description |
| --- | --- |
| main.py | 🐍 Main Python script to fetch API data |
| product_id.csv | 📄 Input list of product IDs |
| database.ini | 🗄️ PostgreSQL configuration file |
| requirements.txt | 📦 Python dependencies |
| supervisord.conf | 🛠️ Supervisor configuration for background execution |

Setup

Before running the script, you need to prepare your environment and database:

  1. Install PostgreSQL. Make sure you have a PostgreSQL server running and can create a database for this project.

  2. Create a database. Example:

```sql
CREATE DATABASE tiki_api_data;
CREATE USER your_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE tiki_api_data TO your_user;
```

  3. Create the database configuration file (database.ini):

```ini
[postgresql_tiki]
host=localhost
port=5432
database=tiki_api_data
user=your_user
password=your_password
```

  4. Optional helper script (connect.py)

  • You don’t strictly need a separate connect.py if your main script reads database.ini and opens a connection internally.

  • If you prefer modularity, you can create a connect.py file:

```python
import psycopg2
from configparser import ConfigParser

def connect(section='postgresql_tiki'):
    """Read database.ini and open a PostgreSQL connection."""
    parser = ConfigParser()
    parser.read('database.ini')
    db = dict(parser[section])
    return psycopg2.connect(**db)
```

Then main.py can import connect() from this file.
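
Before wiring the connection into main.py, it can help to sanity-check that database.ini parses and contains the expected keys. A small sketch (the section name and keys match the example above; the function name is illustrative):

```python
from configparser import ConfigParser

REQUIRED = ("host", "port", "database", "user", "password")

def load_db_config(path="database.ini", section="postgresql_tiki"):
    """Parse the INI file and fail early with a clear error
    if the section or any required key is missing."""
    parser = ConfigParser()
    parser.read(path)
    if section not in parser:
        raise KeyError(f"section [{section}] missing in {path}")
    cfg = dict(parser[section])
    missing = [k for k in REQUIRED if k not in cfg]
    if missing:
        raise KeyError(f"missing keys in [{section}]: {missing}")
    return cfg
```

Failing at startup with a named missing key is much easier to debug than a psycopg2 connection error deep inside the crawl loop.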


Installation

Clone the repository:

```shell
git clone https://github.com/ndlryan/API-Data-with-Postgres.git
cd API-Data-with-Postgres
```

Install dependencies:

```shell
pip install -r requirements.txt
```

Running the Crawler

Run directly from terminal:

```shell
python main.py
```

This will:

  1. Load product IDs from product_id.csv
  2. Fetch product details from Tiki API
  3. Save results into PostgreSQL
  4. Record any failed requests or exceptions in logs
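
The four steps above can be sketched as a single loop; the callable names here are illustrative, not the repository's actual functions:

```python
def crawl(product_ids, fetch_product, save_record, log_error):
    """Minimal sketch of the fetch/save/log loop: fetch each ID,
    persist the result, and record failures without aborting the run."""
    good = failed = 0
    for pid in product_ids:
        try:
            record = fetch_product(pid)   # call the Tiki API for this ID
            save_record(record)           # insert the result into PostgreSQL
            good += 1
        except Exception as exc:          # e.g. HTTP 404 raised by fetch_product
            log_error(pid, exc)
            failed += 1
    return good, failed
```

Because each failure is caught and logged per ID, one bad product cannot take down a 200,000-call run.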

Process Management with Supervisord

For long-running or auto-restarting crawls, you can manage the crawler with Supervisord.

1. Install Supervisor

```shell
pip install supervisor
```

2. Create Configuration File

```ini
[unix_http_server]
file=/tmp/supervisor.sock

[supervisord]
logfile=supervisord.log
pidfile=/tmp/supervisord.pid
childlogdir=./logs

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock

[program:api_data]
command=python3 /path/to/API-Data-with-Postgres/main.py
directory=/path/to/API-Data-with-Postgres
autostart=true
autorestart=true
stderr_logfile=./logs/api_data.err.log
stdout_logfile=./logs/api_data.out.log
```

  • 🔧 Replace /path/to/API-Data-with-Postgres with your actual project path.

3. Start and Monitor

```shell
supervisord -c supervisord.conf
supervisorctl -c supervisord.conf status
```

Restart or stop the crawler anytime:

```shell
supervisorctl -c supervisord.conf restart api_data
supervisorctl -c supervisord.conf stop api_data
```

Logs and Outputs

  • Database: PostgreSQL (records inserted from the API)

  • Error logs: capture exceptions and 404 responses

  • Supervisor logs: stored under ./logs/ when using supervisord.conf


Summary

Est. runtime: ~1 hour
Total processed: 200,000
    - Good records (including those with missing fields): 198,942
    - Exceptions (404 Not Found): 1,058

Notes

  • Ensure database.ini has correct credentials

  • Running the script multiple times does not duplicate records

  • Use Supervisor to prevent downtime and data loss
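
One common way the "no duplicate records" guarantee is achieved in PostgreSQL is an idempotent upsert keyed on the product ID. A sketch of that pattern; the table and column names here are assumptions, not necessarily the repository's actual schema:

```python
# Idempotent upsert: re-running the crawl updates existing rows
# instead of inserting duplicates.
UPSERT_SQL = """
INSERT INTO products (product_id, name, price)
VALUES (%s, %s, %s)
ON CONFLICT (product_id) DO UPDATE
SET name = EXCLUDED.name,
    price = EXCLUDED.price;
"""

def save_record(cur, record):
    """Write one API record through an open psycopg2 cursor."""
    cur.execute(UPSERT_SQL, (record["id"], record["name"], record["price"]))
```

This requires a unique constraint (or primary key) on product_id so that ON CONFLICT has a target.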


Author

Ryan
GitHub Profile

A robust, fault-tolerant Tiki API data loader — lightweight, automated, and production-ready.

