
This project utilises Python control flows and Pandas to parse JSON data efficiently. The parsed data is stored in a PostgreSQL database. Basic SQL commands are used to query and export data from the DB.


Daniel-Elston/JSON_to_PGSQL




Python JSON Parsing and PostgreSQL Integration

The purpose of this project is to develop an understanding of the JSON file format and of how unstructured text data can be stored in a PostgreSQL database and then used in Python.

For JSON parsing code, please see:
https://github.com/Daniel-Elston/JSON_to_PGSQL/blob/master/Notebooks/B1_JSON_Exploration/json_exploration_3.ipynb

-- Project Status: [Complete]


Project Objective

Textual data is often unstructured and can be extremely messy. Having the ability to appropriately store this form of data is essential for ML model building and generating insights.

In the first stage of this project, raw unstructured data in JSON format is parsed using Python and then stored in a PostgreSQL database. Once the data has been stored in an organised manner, PostgreSQL queries are used to export it, ready for processing in Python.
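As a rough illustration of the parsing stage, the sketch below flattens a listing-shaped JSON document into one flat record per post. It uses only the standard library and is not the notebook's actual code. The sample payload, the chosen fields and the name `flatten_listing` are assumptions made for this example, and the nesting (`data` → `children` → `data`) is assumed to match what the Reddit endpoint returns.

```python
import json

# Hypothetical sample shaped like a Reddit listing response (values are made up).
SAMPLE = """
{"kind": "Listing", "data": {"children": [
  {"kind": "t3", "data": {"id": "abc123", "title": "First post",
                          "subreddit": "python", "score": 42, "num_comments": 7}},
  {"kind": "t3", "data": {"id": "def456", "title": "Second post",
                          "subreddit": "datascience", "score": 5}}
]}}
"""

# Fields kept for storage; an assumption for this sketch, not the project's schema.
FIELDS = ("id", "title", "subreddit", "score", "num_comments")


def flatten_listing(raw_text):
    """Parse listing JSON and return one flat dict per post."""
    payload = json.loads(raw_text)
    children = payload.get("data", {}).get("children", [])
    rows = []
    for child in children:
        post = child.get("data", {})
        # Missing keys become None so every row has the same columns.
        rows.append({field: post.get(field) for field in FIELDS})
    return rows


rows = flatten_listing(SAMPLE)
for row in rows:
    print(row)
```

Because every row has the same keys, a list like this could be passed straight to `pandas.DataFrame(rows)` for further processing.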

Raw Data

https://www.reddit.com/r/all.json

Technologies

  • Python (JSON data handling)
  • PostgreSQL
  • Libraries: Pandas, NumPy

Methodologies

  • Parsing and handling JSON data
  • Database design and management with PostgreSQL
  • Data processing and analysis using Python libraries (Pandas, NumPy)
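The storage and export steps might look something like the sketch below. The table name `reddit_posts`, its columns, and the helpers `to_params` and `store_posts` are hypothetical and chosen only for illustration. The `%s` placeholders assume a DB-API driver such as psycopg2, and `conn` is assumed to be a connection object obtained from that driver.

```python
# Hypothetical schema: one row per post, keyed on the post id.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS reddit_posts (
    id           TEXT PRIMARY KEY,
    title        TEXT NOT NULL,
    subreddit    TEXT,
    score        INTEGER,
    num_comments INTEGER
)
"""

# Parameterised insert; ON CONFLICT skips posts that are already stored.
INSERT_SQL = """
INSERT INTO reddit_posts (id, title, subreddit, score, num_comments)
VALUES (%s, %s, %s, %s, %s)
ON CONFLICT (id) DO NOTHING
"""

# Example export query for pulling data back into Python.
EXPORT_SQL = "SELECT subreddit, title, score FROM reddit_posts ORDER BY score DESC"

COLUMNS = ("id", "title", "subreddit", "score", "num_comments")


def to_params(posts):
    """Turn parsed post dicts into tuples ordered like the INSERT columns."""
    return [tuple(post.get(col) for col in COLUMNS) for post in posts]


def store_posts(conn, posts):
    """Create the table if needed and insert the posts via a DB-API connection."""
    params = to_params(posts)
    cur = conn.cursor()
    try:
        cur.execute(CREATE_SQL)
        cur.executemany(INSERT_SQL, params)
        conn.commit()
    finally:
        cur.close()
    return len(params)
```

Passing values as parameters rather than formatting them into the SQL string leaves quoting to the driver, which matters for free-text fields such as post titles.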

Contributing Members

Team Lead: Daniel Elston

Name: Daniel Elston
GitHub Handle: D. Elston

Please feel free to contact me if you have any questions, require any further information or wish to contribute.
Email: delstonds@outlook.com
