This repository contains the data pipeline and analysis to generate a daily key performance indicator report on Tunisair flight delays.
The generated report includes:
- The number of delayed Tunisair departures
- The min, max, and average delays for departures and arrivals (in minutes)
- A bar chart comparing the performance of Tunisair, Nouvelair, and Air France (& Transavia)
- The report is published daily at 9 a.m. (Europe/Paris timezone) on the Twitter account @Tunisairalert
- The data-collection tasks are scheduled with CRON to run hourly from 7 a.m. to midnight.
- An API request is made to Airlabs to gather data in JSON format.
- The JSON data is cleaned, enriched, and saved into a SQLite3 database, `tunisair_delay.db`.
- The airport data is enriched using the `pyairports` module; thanks to NICTA for providing it.

This process keeps the data up to date and accurate, allowing for a reliable analysis of flight performance.
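The ingest step described above could be sketched as follows. This is a minimal illustration, not the repository's actual code: the Airlabs `schedules` endpoint, the JSON field names, and the `flights` table schema are all assumptions.

```python
import os
import sqlite3
import requests

AIRLABS_URL = "https://airlabs.co/api/v9/schedules"  # assumed endpoint

def clean_flight(raw):
    """Keep only the fields the report needs (field names are assumptions)."""
    return {
        "flight_number": raw.get("flight_iata"),
        "airline": raw.get("airline_iata"),
        "dep_airport": raw.get("dep_iata"),
        "arr_airport": raw.get("arr_iata"),
        "dep_delay_min": raw.get("delayed") or 0,  # treat missing delay as 0 min
    }

def save_flights(db_path, flights):
    """Append cleaned rows to a hypothetical `flights` table in SQLite."""
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS flights (
               flight_number TEXT, airline TEXT,
               dep_airport TEXT, arr_airport TEXT,
               dep_delay_min INTEGER)"""
    )
    con.executemany(
        "INSERT INTO flights VALUES (:flight_number, :airline,"
        " :dep_airport, :arr_airport, :dep_delay_min)",
        flights,
    )
    con.commit()
    con.close()

def ingest(db_path):
    """Fetch Tunisair (IATA: TU) schedules and persist them."""
    resp = requests.get(
        AIRLABS_URL,
        params={"api_key": os.environ["token_airlab"], "airline_iata": "TU"},
    )
    rows = [clean_flight(f) for f in resp.json().get("response", [])]
    save_flights(db_path, rows)
```

Running `ingest` once per CRON tick keeps the database current between report generations.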
- The tasks are scheduled with CRON to run every day at 9 a.m. (Europe/Paris timezone).
- A daily query is performed on the SQLite3 database to extract the necessary data for analysis.
- The `pandas` and `matplotlib` libraries are used to create visual representations of the data, such as plots and charts.
- The `Pillow` package is used to assemble the daily report from the visualizations created in the previous step.
This process keeps the report up to date, providing the most current information on Tunisair flight performance. The report is easy to understand because it is accompanied by visual representations of the data.
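The analysis step could look like the sketch below: query SQLite into a DataFrame, compute the min/max/average KPIs, and render the comparison bar chart. The `flights` table and `dep_delay_min` column are illustrative names, not confirmed by the repo.

```python
import sqlite3
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render charts without a display (server use)
import matplotlib.pyplot as plt

def delay_kpis(df):
    """Min, max, and mean departure delay per airline, in minutes."""
    return df.groupby("airline")["dep_delay_min"].agg(["min", "max", "mean"])

def bar_chart(kpis, out_png):
    """Save a bar chart comparing average delays across airlines."""
    ax = kpis["mean"].plot(kind="bar", title="Average departure delay (min)")
    ax.set_xlabel("Airline")
    ax.figure.savefig(out_png, bbox_inches="tight")
    plt.close(ax.figure)

def daily_report(db_path, out_png):
    """Query yesterday's rows (simplified: all rows) and build the chart."""
    df = pd.read_sql("SELECT airline, dep_delay_min FROM flights",
                     sqlite3.connect(db_path))
    kpis = delay_kpis(df)
    bar_chart(kpis, out_png)
    return kpis
```

The resulting PNG is what Pillow would then compose into the final report image.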
- Once the daily report is generated, a tweet is automatically posted to the @Tunisairalert account, providing real-time updates on Tunisair's flight performance to followers.
- The tweet will include a summary of the key performance indicators and a link to the full report for those who want to dive deeper into the data.
This allows for easy dissemination of the report to a wider audience, and also allows for real-time monitoring of Tunisair's performance. The transparency of this process will make it easy for stakeholders to stay informed about the airline's performance.
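The posting step could be sketched with the `tweepy` library, reading credentials from the same `.env` keys the project uses. The tweet wording and the use of tweepy are assumptions; the repo only states that a tweet is posted automatically.

```python
import os

def tweet_text(date_str, delayed_count, avg_delay):
    """Compose the daily summary (wording is illustrative)."""
    return (f"Tunisair on {date_str}: {delayed_count} delayed departures, "
            f"average departure delay {avg_delay:.0f} min. #Tunisair")

def post_report(text, image_path):
    """Post the report image with the summary text via the Twitter API."""
    import tweepy  # imported lazily so tweet_text works without tweepy installed

    auth = tweepy.OAuth1UserHandler(
        os.environ["consumer_key"], os.environ["consumer_secret"],
        os.environ["access_token"], os.environ["access_token_secret"])
    api = tweepy.API(auth)
    media = api.media_upload(image_path)  # upload the PNG report
    api.update_status(status=text, media_ids=[media.media_id])
```

Keeping the text builder separate from the network call makes the summary easy to unit-test without Twitter credentials.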
- Since the script is hosted on a personal server running FreeBSD, an FTP script is used to update the local .db data.
- CRON job for `api_job.py`:

      0 0,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23 * * * root python daily_cron.py

- CRON job for `twitter_job.py`:

      0 9 * * * root python twitter_job.py
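The FTP update mentioned above could be a stdlib `ftplib` one-liner, reusing the `ip_adress`, `login`, and `password` keys from the `.env` file. This is a sketch, not the repository's actual script.

```python
import os
from ftplib import FTP

def push_db(local_path, remote_name="tunisair_delay.db"):
    """Upload the local SQLite file to the server over FTP.

    Host and credentials come from the .env keys described in this README.
    """
    with FTP(os.environ["ip_adress"]) as ftp:
        ftp.login(user=os.environ["login"], passwd=os.environ["password"])
        with open(local_path, "rb") as fh:
            ftp.storbinary(f"STOR {remote_name}", fh)
```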
- You need to obtain a token from airlabs.co and add it to the `.env` file located in the root directory of the project.
- You also need to obtain Twitter API credentials and add them to the `.env` file. See the tutorial and paste the information into `.env`.

The `.env` file will look like this:
    consumer_key=
    consumer_secret=
    access_token=
    access_token_secret=
    path=
    file_name=tunisair_delay.db
    ip_adress=
    login=
    password=
    token_airlab=
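These keys can be loaded into the process environment with a small stdlib helper (the `python-dotenv` package does the same job; this sketch just shows the mechanism):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: KEY=value lines become environment variables.

    Existing environment variables are not overwritten.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
```

After calling `load_env()`, the jobs can read `os.environ["token_airlab"]` and friends without hard-coding secrets.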
📁|- data-analysis : containing all pandas and matplotlib features
📁|- data-pipeline : containing the API requests, SQL queries, and the table .db
📁|- src : containing media, utils, and consts
📁|- test : containing some function tests
🐍 api_job.py : used daily for data scraping
🐍 post_to_twitter.py : used daily to post on Twitter
- Install the packages in `requirements.txt`.
- `api_job.py` is the module that ingests the data from the Airlabs API.
- `twitter_job.py` is the module that posts the report on Twitter.
---
You can check out the full license here.
This project is open source and has no business intent.