
# Section 10: Python Example: Creating Reports with Python

In data science, the ability to communicate findings effectively is as critical as the technical capacity to perform data analysis. Python offers several libraries that can be used to create comprehensive reports combining narratives, analyses, visualizations, and interactive elements. This section demonstrates how to use Python for creating dynamic reports that effectively communicate the insights derived from data.

1. Setting Up the Environment:

To create rich, interactive reports in Python, you may need to use several libraries. For static visualizations, matplotlib and seaborn are excellent choices, while plotly and dash provide interactive capabilities. For report generation, Jupyter Notebooks can be converted into shareable formats using nbconvert, and pandas provides excellent data manipulation capabilities. Install these libraries if they are not already present:

In [None]:
%pip install matplotlib seaborn plotly dash jupyter pandas
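
Before installing anything, it can help to see which of these libraries are already importable. The following is a minimal sketch using only the standard library; the `REQUIRED` list and the `missing_packages` helper are illustrative names, not part of any of the libraries above.

```python
from importlib.util import find_spec

# Import names of the libraries this section relies on (illustrative list).
REQUIRED = ["matplotlib", "seaborn", "plotly", "dash", "pandas", "nbconvert"]


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All report libraries are importable.")
```

Only the names reported as missing need to be passed to the install command.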

2. Importing Required Libraries:

Import the necessary libraries to handle data manipulation, visualization, and the creation of interactive elements:

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

3. Preparing the Data:

For this example, let's assume we have a small dataset of monthly sales and marketing spend that we want to analyse and report on:

In [None]:
# Create a sample DataFrame
data = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June'],
    'Sales': [242, 250, 215, 275, 290, 310],
    'Marketing Spend': [20, 25, 18, 22, 27, 30],
})
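
A report usually needs a few derived figures as well as the raw columns. The sketch below, which assumes the same sample data, adds two illustrative metrics: the month-over-month change in sales and the sales generated per unit of marketing spend. The column names `Sales Change (%)` and `Sales per Spend` are made up for this example.

```python
import pandas as pd

# Same sample data as above, repeated so the example runs on its own.
data = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June'],
    'Sales': [242, 250, 215, 275, 290, 310],
    'Marketing Spend': [20, 25, 18, 22, 27, 30],
})

summary = data.copy()
# Percentage change in sales relative to the previous month (NaN for the first month).
summary['Sales Change (%)'] = summary['Sales'].pct_change().mul(100).round(1)
# Sales generated per unit of marketing spend.
summary['Sales per Spend'] = (summary['Sales'] / summary['Marketing Spend']).round(2)

# Month with the highest sales.
best_month = summary.loc[summary['Sales'].idxmax(), 'Month']

print(summary.to_string(index=False))
print(f"Best month by sales: {best_month}")
```

Figures like these can be quoted in the narrative or written into the final report alongside the plots.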

4. Creating Visualizations:

Generate visualizations that will form part of the report:

In [None]:
# Line plot for sales data
plt.figure(figsize=(10, 5))
plt.plot(data['Month'], data['Sales'], marker='o')
plt.title('Monthly Sales Data')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.grid(True)
plt.savefig('monthly_sales.png')
plt.show()
# Interactive plot using Plotly
fig = px.bar(data, x='Month', y='Marketing Spend', title='Monthly Marketing Spend')
fig.write_html('monthly_marketing_spend.html')
fig.show()
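
Seaborn is imported above but not used in the plots so far. As one possible addition, the sketch below uses `sns.regplot` to plot sales against marketing spend with a fitted linear trend. The output filename `spend_vs_sales.png` is an arbitrary choice, and with only six data points the trend line is illustrative rather than statistically meaningful.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Same sample data as above, repeated so the example runs on its own.
data = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June'],
    'Sales': [242, 250, 215, 275, 290, 310],
    'Marketing Spend': [20, 25, 18, 22, 27, 30],
})

# Scatter plot of spend against sales with a linear regression line.
fig, ax = plt.subplots(figsize=(6, 4))
sns.regplot(data=data, x='Marketing Spend', y='Sales', ax=ax)
ax.set_title('Sales vs. Marketing Spend')
fig.tight_layout()
fig.savefig('spend_vs_sales.png', dpi=150)  # arbitrary filename for the report
plt.close(fig)
```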

5. Writing the Report:

A Jupyter Notebook can already combine narrative, code, and plots into a cohesive report. For a more polished, portable deliverable, write the report to a PDF instead; the next section shows one reliable way to do this.
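
If the notebook itself is the report, it can be exported with nbconvert. The sketch below wraps the `jupyter nbconvert` command line in two small helpers; the function names and the `report.ipynb` filename are hypothetical, and the `--no-input` flag hides code cells so that readers see only narrative and outputs.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook, to="html", hide_code=True):
    """Build the nbconvert command line for exporting a notebook."""
    cmd = ["jupyter", "nbconvert", "--to", to, str(notebook)]
    if hide_code:
        cmd.append("--no-input")  # omit code cells from the exported report
    return cmd


def export_notebook(notebook, to="html", hide_code=True):
    """Run nbconvert and return the expected output path (same stem, new suffix)."""
    subprocess.run(nbconvert_command(notebook, to, hide_code), check=True)
    return Path(notebook).with_suffix("." + to)


# Example (requires Jupyter to be installed and report.ipynb to exist):
# export_notebook("report.ipynb")
print(" ".join(nbconvert_command("report.ipynb")))
```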

## Enhancing Python Reports with PDF Generation

Creating PDF reports directly from Python scripts provides a professional and portable format for distributing analyses and insights. A popular tool for generating PDFs in Python is ReportLab, which allows for extensive customization and embedding of images and text. The steps below add PDF generation to a Python analysis script.

1. Setting Up the Environment:

To generate PDF reports, you'll need to install the ReportLab library, a widely used open-source toolkit for PDF generation in Python:

In [None]:
%pip install reportlab

2. Importing Required Libraries:

Along with ReportLab, ensure you have Pandas for data manipulation and Matplotlib for creating plots that you will embed in the PDF:

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

3. Preparing Data and Visualizations:

As before, let's use a sample dataset to perform analyses and create visualizations:

In [None]:
# Sample DataFrame
data = pd.DataFrame({
    'Month': ['January', 'February', 'March', 'April', 'May', 'June'],
    'Sales': [242, 250, 215, 275, 290, 310],
    'Marketing Spend': [20, 25, 18, 22, 27, 30],
})
# Creating a plot
plt.figure(figsize=(8, 4))
plt.plot(data['Month'], data['Sales'], marker='o')
plt.title('Monthly Sales Trends')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.tight_layout()
plot_path = 'sales_plot.png'
plt.savefig(plot_path)
plt.close()

4. Generating PDF Report:

Use ReportLab to create a PDF report that includes the descriptive statistics and the saved plot:

In [None]:
# Create a PDF with ReportLab
pdf_path = 'Monthly_Report.pdf'
c = canvas.Canvas(pdf_path, pagesize=letter)
width, height = letter  # Page dimensions in points (1 point = 1/72 inch)
# Add a title
c.setFont("Helvetica-Bold", 16)
c.drawCentredString(width / 2.0, height - 50, "Sales and Marketing Report")
# Insert the plot
c.drawImage(plot_path, 100, height - 450, width=400, preserveAspectRatio=True, mask='auto')
# Add some descriptive statistics
c.setFont("Helvetica", 12)
c.drawString(100, height - 500, f"Mean Sales: {data['Sales'].mean():.2f}")
c.drawString(100, height - 525, f"Total Marketing Spend: {data['Marketing Spend'].sum()}")
# Save the PDF
c.showPage()
c.save()

5. Automating and Distributing the Report:

For ongoing reports, you could automate this script to run at specified intervals (e.g., monthly) using a task scheduler like cron (Linux/Mac) or Task Scheduler (Windows). The resulting PDF can then be automatically distributed via email using SMTP.
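
The email step can be sketched with Python's standard `smtplib` and `email` modules. The helper names, the SMTP host, the port, and the addresses below are all placeholders to be replaced with your own mail server's settings; the example builds the message but leaves the actual send commented out.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_report_email(pdf_path, sender, recipients,
                       subject='Monthly Sales and Marketing Report'):
    """Create an email message with the PDF report attached."""
    msg = EmailMessage()
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    msg.set_content('The latest report is attached as a PDF.')
    pdf_file = Path(pdf_path)
    msg.add_attachment(pdf_file.read_bytes(), maintype='application',
                       subtype='pdf', filename=pdf_file.name)
    return msg


def send_report(msg, host, port=587, username=None, password=None):
    """Send the message over SMTP with STARTTLS (server details are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        if username and password:
            server.login(username, password)
        server.send_message(msg)


# Example usage with placeholder values:
# msg = build_report_email('Monthly_Report.pdf', 'reports@example.com',
#                          ['team@example.com'])
# send_report(msg, 'smtp.example.com', username='reports@example.com',
#             password='app-password')
```

Keeping message construction separate from sending makes the script easier to test and lets a scheduler call a single entry point that generates the PDF, builds the email, and sends it.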

6. Conclusion:

Integrating PDF generation into your Python reporting scripts gives stakeholders a polished, portable document that is easy to share and archive. ReportLab supports custom report designs that can be tailored to different formatting and branding requirements, and a scheduled script keeps those reports timely. By combining libraries such as Pandas, Matplotlib, and ReportLab, data scientists can streamline the end-to-end process of data analysis, reporting, and communication.