
# [Practice Notebook] AfterWork: Data Visualization for Healthcare with Matplotlib

# Prerequisites

In [None]:
# Import Pandas for data manipulation
import pandas as pd

# Import Matplotlib for visualization
import matplotlib.pyplot as plt

# 1. Creating basic plots



## 1.1 Creating bar charts


Creating bar charts in Matplotlib allows us to visually represent data using rectangular bars with lengths proportional to the values they represent. This is important for comparing different categories or groups of data at a glance.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_8t5lp.csv")

# Preview the dataset
data.head()

In [None]:
# Create a bar chart
plt.figure(figsize=(10,6))
plt.bar(data['Medication'], data['Quantity'], color='skyblue')

# Style the plot
plt.xlabel('Medication')
plt.ylabel('Quantity')
plt.title('Quantity of Medication in Healthcare and Pharmaceuticals')
plt.xticks(rotation=45)

# Show the plot
plt.show()

### Challenge

Create a bar chart using Matplotlib to visualize the Quantity of each Medication in the dataset from https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_7wt3.csv. Remember to label the x-axis with Medication names and the y-axis with Quantity values.


In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_7wt3.csv")

# Create a bar chart
# Write your code here


# Style the plot (xlabel, ylabel, title, xticks)
# Write your code here


# Show the plot
# Write your code here



## 1.2 Creating line plots


Creating line plots allows us to visualize trends and patterns in our healthcare data. Line plots are useful for showing how a variable changes over time or across different categories. They are particularly effective for displaying continuous data points in a sequential order.



In [None]:
# Load the dataset
df = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_jtezr.csv")

# Preview the dataset
df.head()

In [None]:
# Convert Blood Pressure to systolic and diastolic columns
df[['Systolic_BP', 'Diastolic_BP']] = df['Blood_Pressure'].str.split('/', expand=True)

# Convert Visit_Date to datetime format
df['Visit_Date'] = pd.to_datetime(df['Visit_Date'])

# Preview the dataset
df.head()
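
One detail of the split above: `str.split` returns text, so the new blood pressure columns hold strings until they are cast. The sketch below repeats the same steps on two made-up readings (the `demo` frame and its values are hypothetical, not taken from the course dataset) and adds an `astype(int)` cast, which assumes every reading is complete and well formed.

```python
import pandas as pd

# Hypothetical readings, made up for illustration
demo = pd.DataFrame({
    'Visit_Date': ['2024-01-05', '2024-02-10'],
    'Blood_Pressure': ['120/80', '135/90'],
})

# Split on '/' and cast both parts to integers
# (assumes no missing or malformed readings)
demo[['Systolic_BP', 'Diastolic_BP']] = (
    demo['Blood_Pressure'].str.split('/', expand=True).astype(int)
)

# Parse the date strings into datetime values
demo['Visit_Date'] = pd.to_datetime(demo['Visit_Date'])

# Check the resulting column types
print(demo.dtypes)
```

With numeric columns, the systolic and diastolic values can be plotted or averaged like any other measurement.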

In [None]:
# Create a line plot for Heart Rate over time
plt.figure(figsize=(10, 6))
plt.plot(df['Visit_Date'], df['Heart_Rate'], marker='o', linestyle='-', linewidth=2)

# Style the plot
plt.xlabel('Visit Date')
plt.ylabel('Heart Rate')
plt.title('Heart Rate Over Time')
plt.grid(True)
plt.show()

### Challenge

Create a line plot to visualize the Heart Rate data over time for the patients in the dataset located at https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_m04x.csv.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_m04x.csv")
data.head()

In [None]:
# Create a line plot to visualize the Heart Rate data over time
plt.figure(figsize=(10, 6))
# Write your code here

# Add labels and title (xlabel, ylabel, title, grid)
# Write your code here


# Display the plot
# Write your code here


## 1.3 Creating scatter plots


Creating scatter plots allows us to visualize the relationship between two variables in a dataset. Scatter plots are useful for identifying patterns, trends, and outliers in the data.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_e7m45.csv")

# Preview the dataset
data.head()

In [None]:
# Create a scatter plot of Age vs Weight
plt.scatter(data['Age'], data['Weight'])

# Style the plot
plt.xlabel('Age')
plt.ylabel('Weight')
plt.title('Age vs Weight Scatter Plot')
plt.show()

### Challenge

Create a scatter plot to visualize the relationship between Drug Effectiveness and Side Effects using the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_re24.csv.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_re24.csv")
data.head()

In [None]:
# Create a scatter plot of Drug Effectiveness vs Side Effects


# Style the plot (xlabel, ylabel, title)


# Display the plot



## 1.4 Creating histograms


Histograms in data visualization are used to represent the distribution of a continuous variable. They consist of a series of bars that show the frequency of data within certain ranges. Histograms are important because they allow us to easily identify patterns, trends, and outliers in our data. To create a histogram, we first need to determine the number of bins (or intervals) to divide our data into. We then plot the bars, where the height of each bar represents the frequency of data points within that bin. Finally, we add labels and titles to make the histogram more informative.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_z7psk.csv")

# Preview the dataset
data.head()

In [None]:
# Create the histogram
plt.hist(data['Age'], bins=10, color='skyblue', edgecolor='black')

# Style the plot
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Distribution of Patient Ages')
plt.show()
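
The binning step described above can also be inspected directly. This is a minimal sketch using NumPy's `np.histogram` on a handful of made-up ages (the `ages` array is hypothetical, not from the course dataset); it computes the bin edges and counts first, then passes the same edges to `hist` so the bars match the numbers.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical ages, made up for illustration
ages = np.array([23, 25, 31, 35, 38, 42, 44, 47, 51, 58, 63, 67])

# Divide the range of the data into 5 equal-width bins and count values per bin
counts, edges = np.histogram(ages, bins=5)
print("Bin edges:", edges)
print("Counts per bin:", counts)

# Draw the histogram using the same bin edges
fig, ax = plt.subplots()
ax.hist(ages, bins=edges, color='skyblue', edgecolor='black')
ax.set_xlabel('Age')
ax.set_ylabel('Frequency')
ax.set_title('Distribution of Ages (illustrative data)')
plt.show()
```

Changing `bins` changes how coarse or fine the distribution looks, so it is worth trying a few values before settling on one.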

### Challenge


Create a histogram to visualize the distribution of patient ages in the healthcare dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_mrhw.csv.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_mrhw.csv")
data.head()

In [None]:
# Create the histogram


# Add labels and title (xlabel, ylabel, title)



# Display the histogram


## 1.5 Creating pie charts


Pie charts in data visualization are circular statistical graphics that are divided into slices to illustrate numerical proportions. We use pie charts when we want to show the relative sizes of different categories in a dataset. For example, we can use a pie chart to visualize the distribution of different types of diseases in a population.

In [None]:
# Import the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_qwmtd.csv")

# Preview the dataset
data.head()

In [None]:
# Create the pie chart
plt.figure(figsize=(8,8))
plt.pie(data['Quantity'], labels=data['Medication'], autopct='%1.1f%%', startangle=140)

# Style the plot
plt.title("Distribution of Medications")
plt.legend(loc="upper right")
plt.axis('equal')
plt.show()

### Challenge


Create a pie chart to visualize the distribution of medication quantities in the dataset from https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_rsej.csv.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_rsej.csv")
data.head()

In [None]:
# Write your code here

# Set the figure size to (8,8)


# Create the plot (set startangle to 140)


# Style the chart (title, legend, axis)



# Display the chart



# 2. Customizing plots

## 2.1 Setting the figure size


Setting the figure size allows us to control the dimensions of our charts, ensuring they are displayed in the desired size. This is important for creating visually appealing and informative data visualizations. We can adjust the figure size based on the specific requirements of our project, such as fitting the chart into a report or presentation slide, or optimizing it for viewing on different devices.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_lna9f.csv")

# Preview the dataset
data.head()

In [None]:
# Set the figure size
plt.figure(figsize=(8, 6))

# Create the scatter plot
plt.scatter(data['Age'], data['Dosage'])

# Style the plot
plt.xlabel('Age')
plt.ylabel('Dosage')
plt.title('Age vs Dosage Scatterplot')
plt.show()
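
`figsize` is given in inches as (width, height), so the size in pixels also depends on the figure's `dpi` (dots per inch). A minimal sketch of that relationship, with `dpi=100` chosen only for illustration:

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; dpi=100 is an illustrative choice
fig = plt.figure(figsize=(8, 6), dpi=100)

# Pixel size = size in inches * dots per inch
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)

# Close the empty figure so it is not displayed
plt.close(fig)
```

This can help when a chart has to fit a slide or report with known pixel dimensions.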

### Challenge

Create a bar chart using the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_bp5x.csv. Set the figure size to 8 inches in width and 6 inches in height. Remember to label the x-axis with 'Medication' and the y-axis with 'Price'.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_bp5x.csv")
data.head()

In [None]:
# Set the figure size


# Create a bar chart (Medication vs. Price)


# Style the plot
plt.xlabel('Medication')
plt.ylabel('Price')
plt.title('Medication vs Price Bar Chart')
plt.show()

## 2.2 Customizing colors and markers


Customizing colors and markers allows us to personalize the appearance of our plots. This helps us to make our visualizations more visually appealing and easier to interpret.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_o9c45.csv")

# Preview the dataset
data.head()

In [None]:
# Create a scatter plot with customized colors and markers
plt.scatter(data['Age'], data['Diagnosis'], c='red', marker='^')

# Style the plot
plt.xlabel('Age')
plt.ylabel('Diagnosis')
plt.title('Patient Age vs Diagnosis')
plt.show()

### Challenge

Create a Python program that visualizes the 'Color' and 'Shape' attributes of the healthcare and pharmaceutical dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_g5cn.csv using Matplotlib.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_g5cn.csv")
data.head()

In [None]:
# Create a scatter plot with customized colors and markers (set c and marker parameter values to 'blue' and 'o')


# Style the chart
plt.xlabel('Color')
plt.ylabel('Shape')
plt.title('Color vs Shape in Healthcare and Pharmaceuticals Dataset')
plt.show()

## 2.3 Customizing annotations

Customizing annotations based on data points allows us to add additional information to our charts by highlighting specific data points with text or shapes. This helps to draw attention to important data points and provide context to the viewer.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_ezsug.csv")

# Preview the dataset
data.head()

In [None]:
# Display the range of row positions
data_range = range(len(data))
data_range

In [None]:
# Create a scatter plot
plt.scatter(data['Age'], data['Dosage'])

# Add annotations for specific data points
for i in range(len(data)):
    # Arguments: label text, the (x, y) point to annotate, the coordinate system for xytext, the text offset in points, horizontal alignment (ha), and font size
    plt.annotate(data['Medication'][i], (data['Age'][i], data['Dosage'][i]), textcoords="offset points", xytext=(0,10), ha='center', fontsize=9)

# Style the plot
plt.xlabel('Age')
plt.ylabel('Dosage')
plt.title('Dosage vs Age for Medication')
plt.show()
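
Annotations can also single out one point instead of labelling all of them. The sketch below is one possible approach, using a small made-up frame (`demo` and its values are hypothetical): it finds the row with the highest dosage and points an arrow at it.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data, made up for illustration
demo = pd.DataFrame({'Age': [25, 40, 55, 70], 'Dosage': [10, 20, 15, 30]})

# Select the row with the highest dosage
top = demo.loc[demo['Dosage'].idxmax()]

fig, ax = plt.subplots()
ax.scatter(demo['Age'], demo['Dosage'])

# Place the text 80 points left of and 30 points below the data point,
# and draw an arrow from the text to the point
note = ax.annotate(
    'Highest dosage',
    xy=(top['Age'], top['Dosage']),
    xytext=(-80, -30),
    textcoords='offset points',
    arrowprops=dict(arrowstyle='->'),
)

ax.set_xlabel('Age')
ax.set_ylabel('Dosage')
ax.set_title('Highlighting a Single Data Point (illustrative data)')
plt.show()
```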

### Challenge

Create a bar chart using the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_swak.csv. Customize the annotations to highlight the Dosage of each medication. Remember to use the 'Dosage' data points to add additional information to the chart.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_swak.csv")
data.head()

In [None]:
# Create a bar chart
plt.bar(data['Medication'], data['Dosage'])

# Add annotations for specific data points


# Style the plot
plt.xlabel('Medication')
plt.ylabel('Dosage')
plt.title('Dosage for each Medication')
plt.xticks(rotation=45)

# Display the plot
plt.show()

## 2.4 Customizing Color Palettes

Customizing color palettes in data visualization allows us to choose specific colors for our charts and graphs. This helps us convey information more effectively, make our visualizations more visually appealing, and ensure color consistency across different visualizations.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_emur1.csv")

# Preview the dataset
data.head()

In [None]:
# Customizing color palette
colors = ['#FF6347', '#1E90FF', '#32CD32', '#FFD700', '#BA55D3', '#FFA07A']

# Plotting a bar chart with customized colors
plt.figure(figsize=(10, 6))
plt.bar(data['Name'], data['Age'], color=colors)

# Style the plot
plt.xlabel('Patient Name')
plt.ylabel('Age')
plt.title('Age Distribution of Patients')
plt.xticks(rotation=45)
plt.show()
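
A hand-picked list works well when the number of bars is known in advance. When it is not, one alternative is to sample as many colors as needed from a built-in colormap. This is a minimal sketch with made-up names and values (`names` and `values` are hypothetical); `viridis` is just one of Matplotlib's colormaps.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical categories and values, made up for illustration
names = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
values = [34, 45, 29, 52, 41, 38, 60, 47]

# Sample one RGBA color per bar, evenly spaced along the colormap
palette = plt.cm.viridis(np.linspace(0, 1, len(names)))

fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.bar(names, values, color=palette)
ax.set_xlabel('Category')
ax.set_ylabel('Value')
ax.set_title('One Color per Bar from a Colormap (illustrative data)')
plt.show()
```

Because the palette is built from `len(names)`, it always has exactly one color per bar.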

### Challenge

Create a bar chart using the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_7jtz.csv. Customize the color palette of the chart to represent each medication with a different color.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_7jtz.csv")
data.head()

In [None]:
# Customizing color palette ('#FF6347', '#1E90FF', '#32CD32', '#FFD700', '#BA55D3', '#FFA07A')
# Get hex color codes here: https://www.color-hex.com


# Set the figure size
plt.figure(figsize=(10, 6))

# Create a bar chart with customized colors
plt.bar(data['Medication'], data['Quantity'], color=colors)

# Customize the plot
plt.xlabel('Medication')
plt.ylabel('Quantity')
plt.title('Quantity of Medication Prescribed')
plt.xticks(rotation=45)

# Display the plot
plt.show()

## 2.5 Styling legends

Styling legends in data visualization allows us to customize the appearance of the legend in our charts. Legends are essential for providing context to the data being displayed and making it easier for viewers to interpret the information. By styling legends, we can make them more visually appealing and easier to understand, enhancing the overall presentation of our charts.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_dszgy.csv")

# Preview the dataset
data.head()

In [None]:
# Create a bar chart
plt.bar(data['Name'], data['Age'], label='Age')

# Customize the legend
plt.legend(title='Patient Information', title_fontsize='large', loc='upper left', shadow=True)

# Style the plot
plt.xticks(rotation=45)
plt.show()

### Challenge

Create a bar chart visualization using the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_0ps9.csv. The chart should display the Frequency of each Medication.

In [None]:
# Load the data
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_0ps9.csv")
data.head()

In [None]:
# Create a bar chart (Medication vs. Frequency) and set the label parameter


# Customize the legend (title: Medication Frequency, title font size: large, location: upper left, shadow: true)
plt.legend(title='Medication Frequency', title_fontsize='large', loc='upper left', shadow=True)

# Style the plot
plt.xticks(rotation=45)

# Display the plot
plt.show()

# 3. Layouts

## 3.1 Arranging subplots in a grid layout

Arranging subplots in a grid layout allows us to display multiple plots in a structured and organized manner within a single figure. This helps us compare different visualizations side by side and analyze relationships between them more effectively.



In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_hg05w.csv")

# Preview the dataset
data.head()

In [None]:
# Create subplots in a 1x2 grid layout
fig, axs = plt.subplots(1, 2, figsize=(10, 5))

# Plot Age distribution
axs[0].hist(data['Age'], bins=10, color='skyblue')
axs[0].set_title('Age Distribution')

# Plot Gender distribution
data['Gender'].value_counts().plot(kind='bar', ax=axs[1], color='salmon')
axs[1].set_title('Gender Distribution')

plt.tight_layout()
plt.show()

### Challenge

Create a grid layout of subplots to display the Quantity and Price of different medications from the dataset located at https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_aysk.csv. Ensure that the Quantity is displayed in one subplot and the Price is displayed in another subplot. Compare the visualizations side by side to analyze any relationships between the Quantity and Price of the medications.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_aysk.csv")

# Create subplots in a 1x2 grid layout


# Plot Quantity distribution


# Plot Price distribution


## 3.2 Shared axes

Sharing axes in data visualization refers to using the same x or y axis across multiple plots in order to compare data more easily. This concept is important because it allows us to visually compare different datasets on the same scale, making it easier to identify patterns and trends. To share axes in Matplotlib, we can use the sharex or sharey parameters when creating subplots. By setting these parameters to True, we can ensure that all subplots share the same x or y axis, respectively.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_t5z2p.csv")

# Preview the dataset
data.head()

In [None]:
# Create subplots with shared x-axis
fig, axs = plt.subplots(2, 1, figsize=(6, 6), sharex=True)

# Plot Age vs Insurance Coverage
axs[0].scatter(data['Age'], data['Insurance Coverage'])
axs[0].set_title('Age vs Insurance Coverage')

axs[0].set_ylabel('Insurance Coverage')

# Plot Age vs Medication
axs[1].scatter(data['Age'], data['Medication'])
axs[1].set_title('Age vs Medication')
axs[1].set_xlabel('Age')
axs[1].set_ylabel('Medication')

plt.tight_layout()
plt.show()
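
The example above shares the x-axis. The `sharey` parameter works the same way for the y-axis, which helps when two groups should be read against the same scale. A minimal sketch with made-up visit counts (the clinic names and numbers are hypothetical):

```python
import matplotlib.pyplot as plt

# Hypothetical quarterly visit counts, made up for illustration
quarters = ['Q1', 'Q2', 'Q3', 'Q4']
clinic_a = [120, 135, 150, 160]
clinic_b = [40, 55, 45, 60]

# Two side-by-side subplots that use the same y-axis scale
fig, axs = plt.subplots(1, 2, figsize=(8, 4), sharey=True)

axs[0].bar(quarters, clinic_a, color='skyblue')
axs[0].set_title('Clinic A')
axs[0].set_ylabel('Visits')

axs[1].bar(quarters, clinic_b, color='salmon')
axs[1].set_title('Clinic B')

plt.tight_layout()
plt.show()
```

Without `sharey=True`, each subplot would pick its own y-range and the smaller clinic's bars would look as tall as the larger one's.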

### Challenge

Create a Python program that reads in the dataset from the URL: https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_jbao.csv and creates two subplots to visualize the Quantity and Price of different medications.

In [None]:
# Write your code here


## 3.3 Adjusting plot spacing

Adjusting plot spacing in Matplotlib allows us to control the distance between plots in a layout. This is important for creating visually appealing and organized data visualizations. By adjusting plot spacing, we can prevent overlapping plots and make it easier for viewers to interpret the data.

In [None]:
# Load the dataset
data = pd.read_csv("https://afterwork.ai/ds/e/healthcare_and_pharmaceuticals_5j7zg.csv")

# Preview the dataset
data.head()

In [None]:
# Create subplots with adjusted spacing
fig, axs = plt.subplots(2, 2, figsize=(6, 6))
plt.subplots_adjust(wspace=0.5, hspace=0.5)

# Plot data
axs[0, 0].scatter(data['Age'], data['Diagnosis'])
axs[0, 1].bar(data['Gender'], data['Dosage'])
axs[1, 0].hist(data['Age'])
# Take labels from the value_counts index so each label matches its slice
axs[1, 1].pie(data['Gender'].value_counts(), labels=data['Gender'].value_counts().index)

plt.show()

### Challenge

Create a data visualization using Matplotlib to plot the Quantity of each Medication from the dataset located at https://afterwork.ai/ds/ch/healthcare_and_pharmaceuticals_e6tk.csv. Adjust the plot spacing to ensure there is no overlap between the bars.

In [None]:
# Write your code here
