Data-pipeline-python-sql-tableau

End-to-end data analytics project using Python, SQL Server, and Tableau for ETL, storage, and visualization.

End-to-End Data Analytics Project using Python, SQL Server & Tableau

This project demonstrates an end-to-end data analytics workflow — from data extraction and transformation using Python, to storage and querying with SQL Server, and finally data visualization using Tableau. It aims to generate meaningful business insights through automation and reporting.

📁 Project Structure

Data-pipeline-python-sql-tableau/
│
├── data/               # Raw or sample datasets
├── python/             # Python scripts for ETL
├── sql/                # SQL scripts (DDL/DML queries)
├── tableau/            # Tableau workbooks (.twb/.twbx files or screenshots)
├── README.md           # Project documentation
└── requirements.txt    # Python dependencies

💻 Technologies Used

• Python (Pandas, SQLAlchemy/pyodbc) – for data extraction, transformation (ETL)

• SQL Server – for data storage and querying

• Tableau – for building interactive dashboards

• Jupyter Notebook – for EDA and testing

✨ Features

• Automated ETL using Python

• Cleaned and normalized data stored in SQL Server

• Interactive dashboards in Tableau

• Real-world dataset (customizable)

• Modular codebase and reusable scripts

🚀 How to Run

1. Clone the Repository

git clone https://github.com/Mahpara810/Data-pipeline-python-sql-tableau
cd Data-pipeline-python-sql-tableau

2. Set Up the Environment

pip install -r requirements.txt

3. Data Source

The dataset used in this project is publicly available on Kaggle:

Telco Customer Churn Dataset

4 Run the ETL and Data Modeling Notebook

• Open the Jupyter Notebook:
notebooks/Telecom-customer-churn.ipynb

• This notebook performs:

Data loading and cleaning
Transformation into a star schema (fact and dimension tables)
Uploading the transformed data into the existing tables in the SQL Server database.

• Ensure your SQL Server connection details (host, database, username, password) are correctly configured in the notebook.

• After connecting, verify that data of the fact and dimension tables have been successfully loaded into SQL Server.

5. Set Up SQL Server

• Create a new database.

• Run the SQL scripts in the sql/ folder to create the required tables.

• Make a connection to SQL Server from the Jupyter notebook using pyodbc or sqlalchemy.

• Push the cleaned and transformed data from the notebook into the SQL Server tables using pandas.to_sql() or SQL queries.

6. Open Tableau Dashboard

• Open the .twb or .twbx file from the tableau/ folder using Tableau.

• Connect to your SQL Server database, or use the extracted data embedded in the workbook.

7. Explore the Dashboards!

• Explore and analyze the visualized data insights to uncover trends, patterns, and key metrics.

🧠 Business Problem

• How many customers have churned, and what is the overall churn rate?

• Which customer groups (by gender, internet service type, and contract type) are most likely to churn?

• How does the type of internet service (DSL, Fiber Optic, or None) affect customer churn rates?

• Are customers on month-to-month contracts more likely to leave compared to one- or two-year contracts?

• Which payment methods are most commonly used, and do they influence customer retention?

• How much revenue is generated at different stages of a customer's tenure?

• Who are the top 10 highest-spending customers based on average monthly charges?

• Which type of internet service is the most popular among customers?

• What is the average customer tenure, and how does it relate to loyalty and churn?

• How many senior citizens live alone, and could this impact their likelihood of churning?

📊 Dashboard

✅ Use Cases

This template can be adapted for:

• Sales Performance Analysis

• Customer Segmentation

• Financial Reporting

• Healthcare Data Analysis

• Marketing Campaign Insights

📚 Resources and References

1. Dataset Source:

Telco Customer Churn Dataset This dataset, provided by IBM Sample Data Sets, includes information about a telecom company's customers, such as services signed up for, customer account information, and whether or not the customer churned. It's commonly used for customer retention and churn prediction modeling.

2. Official Library Documentation:

• Pandas Documentation

• NumPy Documentation

• Matplotlib Documentation

• Seaborn Documentation

• Scikit-learn Documentation

📘 Additional Learning Resources (if applicable):

• IBM Developer – Predict customer churn with SciKit-Learn

• Medium Article – Customer Churn Prediction with Telco Dataset

🌟 Acknowledgments

• Thanks to IBM Sample Data Sets for providing the Telco Customer Churn dataset.

• Special thanks to Kaggle for hosting and making the dataset easily accessible to the data science community.

• Gratitude to the open-source community behind tools like Pandas, NumPy, Matplotlib and Seaborn.

• Appreciation to the authors of various tutorials and blog posts that helped guide the data exploration and model building process.

🤝Contributions

Contributions are welcome! Please fork the repo, make your changes, and submit a pull request.

📝 License

This project is licensed under the MIT License.

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
README.md		README.md
SQL-connection-with-jupyterNotebook.sql		SQL-connection-with-jupyterNotebook.sql
Star-Schema.PNG		Star-Schema.PNG
Telecom-Customer-Churn.sql		Telecom-Customer-Churn.sql
Telecom-customer-churn-dashboard.twb		Telecom-customer-churn-dashboard.twb
Telecom-customer-churn.ipynb		Telecom-customer-churn.ipynb
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Data-pipeline-python-sql-tableau

End-to-End Data Analytics Project using Python, SQL Server & Tableau

📁 Project Structure

💻 Technologies Used

✨ Features

🚀 How to Run

1. Clone the Repository

2. Set Up the Environment

3. Data Source

4 Run the ETL and Data Modeling Notebook

5. Set Up SQL Server

6. Open Tableau Dashboard

7. Explore the Dashboards!

🧠 Business Problem

📊 Dashboard

✅ Use Cases

📚 Resources and References

1. Dataset Source:

2. Official Library Documentation:

📘 Additional Learning Resources (if applicable):

🌟 Acknowledgments

🤝Contributions

📝 License

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Mahpara810/Data-pipeline-python-sql-tableau

Folders and files

Latest commit

History

Repository files navigation

Data-pipeline-python-sql-tableau

End-to-End Data Analytics Project using Python, SQL Server & Tableau

📁 Project Structure

💻 Technologies Used

✨ Features

🚀 How to Run

1. Clone the Repository

2. Set Up the Environment

3. Data Source

4 Run the ETL and Data Modeling Notebook

5. Set Up SQL Server

6. Open Tableau Dashboard

7. Explore the Dashboards!

🧠 Business Problem

📊 Dashboard

✅ Use Cases

📚 Resources and References

1. Dataset Source:

2. Official Library Documentation:

📘 Additional Learning Resources (if applicable):

🌟 Acknowledgments

🤝Contributions

📝 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages