jh-ecomp/jira-etl

A simple library to start scraping your Jira cards

Status: Under Development

About

Jira ETL is a Python script that extracts your Jira issues, transforms them, and stores them in your data warehouse.


How it works

This project is a Python script that collects issues from Jira through its REST API. The Jira data is stored in PostgreSQL,
and I chose SQLAlchemy and Psycopg to work with the data warehouse.

Edit the config files to point at your Jira instance to start the ETL.
The script is modular, so feel free to skip the database part entirely and build your own storage step.
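The collection step can be sketched roughly as below. This is a minimal illustration, not the project's actual code: `iter_issues` and `fake_fetch` are hypothetical names, and the parameter and response fields (`jql`, `startAt`, `maxResults`, `issues`, `total`) assume the paginated Jira search endpoint (`/rest/api/2/search`), which may differ for your Jira version.

```python
from typing import Callable, Dict, Iterator

# Hypothetical type: takes the query parameters, returns the decoded JSON body.
FetchPage = Callable[[Dict[str, object]], Dict[str, object]]


def iter_issues(fetch_page: FetchPage, jql: str, page_size: int = 50) -> Iterator[dict]:
    """Yield every issue matching `jql`, requesting one page at a time."""
    start_at = 0
    while True:
        page = fetch_page({"jql": jql, "startAt": start_at, "maxResults": page_size})
        issues = page.get("issues", [])
        if not issues:
            break
        yield from issues
        start_at += len(issues)
        # Stop once we have walked past the total reported by the server.
        if start_at >= page.get("total", 0):
            break


def fake_fetch(params: Dict[str, object]) -> Dict[str, object]:
    """Stand-in for the HTTP call so the sketch runs without a Jira instance."""
    data = [{"key": f"DEMO-{i}"} for i in range(1, 6)]
    start, size = params["startAt"], params["maxResults"]
    return {"startAt": start, "maxResults": size, "total": len(data),
            "issues": data[start:start + size]}


keys = [issue["key"] for issue in iter_issues(fake_fetch, "project = DEMO", page_size=2)]
print(keys)
```

With Requests, `fetch_page` could wrap something like `requests.get(url, params=params, auth=HTTPBasicAuth(user, token)).json()`, where `url`, `user`, and `token` come from your own configuration.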

Pre-requisites

Before you begin, you will need the following tools installed on your machine:
[Git](https://git-scm.com), [Python 3.6+](https://www.python.org/), [PostgreSQL](https://www.postgresql.org/).
As a suggestion, use a modern editor like [PyCharm](https://www.jetbrains.com/pycharm/) or [VSCode](https://code.visualstudio.com/).

Dependencies

  
# SQLAlchemy  
# Psycopg2  
# Requests  
  

Installation

  
# Clone this repository  
$ git clone git@github.com:jh-ecomp/jira-etl.git  
  
# Access the project folder in your cmd/terminal  
$ cd jira-etl  
  
# Install dependencies  
$ pip install -r requirements.txt  
  

Using

This project is the basis for your application.

For hobby use, you need to change:

  • jira.config file, inserting the parameters of your Jira instance and the query you want to run;
  • database.config file, inserting the parameters for communication with your database;
  • jira_requests.py file, inserting your credentials for HTTPBasicAuth;
  • db_connection.py file, inserting your credentials to access the database.

For professional use, more things need to change, for example:

  • jira.config file, inserting the parameters of your Jira instance and the query you want to run;
  • database.config file, inserting the parameters for communication with your database;
  • change the authentication mode in jira_requests.py;
  • decrypt the database credentials in db_connection.py.
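One way to keep credentials out of the source files is to read the connection parameters from the config file and the secrets from environment variables. The sketch below assumes an INI-style `jira.config` with a `[jira]` section; the section name, the keys (`base_url`, `jql`), the environment variable names, and `load_jira_settings` are all hypothetical and may not match the project's real files.

```python
import configparser
import os
from typing import Mapping

# Hypothetical file contents; the real jira.config layout may differ.
SAMPLE_JIRA_CONFIG = """
[jira]
base_url = https://example.atlassian.net/
jql = project = DEMO ORDER BY created DESC
"""


def load_jira_settings(config_text: str, env: Mapping[str, str] = os.environ) -> dict:
    """Combine non-secret settings from the config with secrets from the environment."""
    # interpolation=None so a '%' inside a JQL query is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = parser["jira"]
    return {
        "base_url": section["base_url"].rstrip("/"),
        "jql": section["jql"],
        # Assumed variable names; missing values come back as None.
        "user": env.get("JIRA_USER"),
        "token": env.get("JIRA_API_TOKEN"),
    }


settings = load_jira_settings(
    SAMPLE_JIRA_CONFIG,
    env={"JIRA_USER": "me@example.com", "JIRA_API_TOKEN": "secret"},
)
print(settings["base_url"], settings["jql"])
```

The same pattern would apply to database.config and db_connection.py: connection parameters in the file, user name and password from the environment or a secrets manager.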
  
# Import the lib  
from jiraetl import get_issues, process_issues, store_issues  
  
# Call get_issues to download a list of JSON objects for the issues described in jira.config  
issues = get_issues()  
  
# Process the issues to carry out the necessary transformations and cleaning  
processed_issues = process_issues(issues)  
  
# Finally, store the data  
store_issues(processed_issues)  
#  
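Because the steps are independent, the storage step can be swapped for something other than the database. The sketch below writes issues to a JSON Lines file instead; `store_issues_as_jsonl` is a hypothetical replacement for `store_issues`, and it assumes the processed issues are a list of JSON-serializable dictionaries, which this README does not confirm.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterable


def store_issues_as_jsonl(processed_issues: Iterable[dict], path: Path) -> int:
    """Write one JSON document per line and return how many issues were written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for issue in processed_issues:
            handle.write(json.dumps(issue, ensure_ascii=False) + "\n")
            count += 1
    return count


# Sample data standing in for the output of process_issues().
sample = [
    {"key": "DEMO-1", "status": "Done"},
    {"key": "DEMO-2", "status": "To Do"},
]
out_path = Path(tempfile.mkdtemp()) / "issues.jsonl"
written = store_issues_as_jsonl(sample, out_path)
print(f"wrote {written} issues to {out_path}")
```

In the pipeline above, this would take the place of the final call: `store_issues_as_jsonl(processed_issues, Path("issues.jsonl"))`.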

Author

João Henrique

---

License

This project is under the MIT license.

Made by João Henrique 👋🏽 Get in Touch!
