manish-jsx/data-pipeline-php-airflow

1. Data Pipeline:

  • GitHub Repo Name: data-pipeline-php-airflow
  • MVP Goal: Extract data from a CSV file, transform it, and load it into a MySQL table, with the run scheduled by Airflow.
  • Key Components:
    • PHP Script:
      • Extract data from a local CSV file.
      • Transform data with basic cleaning (e.g., trimming whitespace and simple data type conversions).
      • Load data into a MySQL table.
    • MySQL Database:
      • A simple target table.
    • Apache Airflow:
      • A single DAG to run the PHP ETL script.
  • Steps:
    1. Set up MySQL: Create a database and table.
    2. Create PHP Script: Implement basic ETL logic (using a basic CSV file).
    3. Dockerize Airflow: Start Airflow using a docker-compose setup with a basic configuration.
    4. Create an Airflow DAG: Write a simple DAG in Python to run the PHP script.
    5. Run the DAG: Ensure the pipeline runs successfully.
    6. Documentation: Add a README with setup and usage instructions.
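
Step 4 above could look roughly like the sketch below. It is only an illustration, not the repository's actual DAG: the script path, DAG id, task id, and daily schedule are all hypothetical, and it assumes Airflow 2.4 or later (for the `schedule` keyword) with the PHP CLI available on the worker's PATH.

```python
"""Sketch of an Airflow DAG that runs the PHP ETL script once a day."""
import shlex
from datetime import datetime

# Assumed location of the PHP script inside the Airflow container (hypothetical).
PHP_SCRIPT = "/opt/airflow/scripts/etl.php"


def build_command(script_path: str = PHP_SCRIPT, php_bin: str = "php") -> str:
    """Return a shell-safe command line that runs the PHP script."""
    return f"{shlex.quote(php_bin)} {shlex.quote(script_path)}"


def build_dag():
    """Create the DAG; Airflow is imported here so build_command works without it."""
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="php_etl_pipeline",        # hypothetical DAG name
        start_date=datetime(2024, 1, 1),  # arbitrary fixed start date
        schedule="@daily",                # assumed schedule; Airflow 2.4+ keyword
        catchup=False,                    # do not backfill past dates
    ) as dag:
        # Single task: shell out to the PHP interpreter.
        BashOperator(task_id="run_php_etl", bash_command=build_command())
    return dag


try:
    # Airflow discovers DAG objects bound to module-level names.
    dag = build_dag()
except ImportError:
    # Airflow is not installed (e.g. when checking build_command locally).
    dag = None
```

Because the task simply shells out to `php`, the container that executes it needs a PHP interpreter. The stock Airflow image does not ship one, so a custom image or a different operator would likely be needed in practice.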
