twarc2sql

This package converts JSONL files generated by twarc2 into a SQL database in an opinionated way.

Features

  • Converts JSONL files generated by twarc2 into a PostgreSQL database in an opinionated way.
  • Creates a database with multiple tables; the schema is described in the documentation and in the models.py file.
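
For orientation, here is a minimal sketch of what reading such an input file can look like. It is not the package's own loader. It assumes the usual twarc2 layout, where each line is one JSON object whose "data" field holds either a list of tweets or a single tweet. The helper names `iter_jsonl` and `count_tweets` are hypothetical.

```python
import json
from typing import Iterator


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_tweets(path: str) -> int:
    """Count tweet objects across all lines (assumed layout, see above)."""
    total = 0
    for page in iter_jsonl(path):
        data = page.get("data", [])
        # "data" may be a list of tweets or a single tweet object.
        total += len(data) if isinstance(data, list) else 1
    return total
```

A quick count like this can help confirm that a file is well-formed before loading it into the database.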

Installation

You can install twarc2sql using pip:

$ pip install twarc2sql

Usage

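import twarc2sql

# Placeholder arguments: replace each string with your own value.
twarc2sql.connect_to_db_and_upload(
    "folderpath/to/jsonl/file",      # folder containing the JSONL file
    "jsonl_file",                    # name of the JSONL file
    "twarc_task_type",               # twarc2 task type that produced the file
    "env_file_with_db_information",  # env file with the database settings
)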

Example of env file:

DB_NAME=postgres
DB_USER=postgres
DB_PASSWORD=postgres
DB_HOST=localhost
DB_PORT=5432
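
To illustrate how these five settings fit together, the sketch below parses an env file in this format and assembles a PostgreSQL connection URL. It is an assumption-laden illustration, not how twarc2sql reads the file internally. The helpers `read_env_file` and `build_postgres_url` are hypothetical names, and only the standard library is used.

```python
from urllib.parse import quote_plus


def read_env_file(path: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            # Drop surrounding whitespace and optional quotes around the value.
            settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def build_postgres_url(settings: dict) -> str:
    """Build a postgresql:// URL from the DB_* keys shown in the example."""
    user = quote_plus(settings["DB_USER"])
    password = quote_plus(settings["DB_PASSWORD"])  # escape special characters
    host = settings["DB_HOST"]
    port = settings["DB_PORT"]
    name = settings["DB_NAME"]
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"
```

Checking the assembled URL with a database client is a simple way to verify the env file before running an upload.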

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.