<a href="https://colab.research.google.com/github/ElhassanGitUub/PyProj/blob/main/Stacked_Charts.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Stacked Charts
1. Stacked Chart of Median JobSatPoints_6 and JobSatPoints_7 for Different Age Groups
Visualize the composition of job satisfaction scores (JobSatPoints_6 and JobSatPoints_7) across age groups. This helps show how satisfaction levels break down across different demographics.

2. Stacked Chart of JobSatPoints_6 and JobSatPoints_7 for Employment Status
Create a stacked chart to compare job satisfaction (JobSatPoints_6 and JobSatPoints_7) across employment statuses. This shows how satisfaction varies by employment type.



In [None]:
!wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/QR9YeprUYhOoLafzlLspAw/survey-results-public.sqlite
import sqlite3
import pandas as pd
import matplotlib.pyplot as plt

# Connect to the SQLite database
conn = sqlite3.connect('survey-results-public.sqlite')

# Load the data
QUERY = "SELECT * FROM main"
df = pd.read_sql_query(QUERY, conn)

# Check that the required columns are present
required_columns = {'Age', 'JobSatPoints_6', 'JobSatPoints_7', 'Employment'}
if not required_columns.issubset(df.columns):
    print(f"Some columns are missing: {required_columns - set(df.columns)}")
else:
    ### 1. Stacked chart of JobSatPoints by age group ###

    # Convert Age to numeric and create age groups.
    # Note: errors='coerce' turns any non-numeric Age value into NaN.
    df['Age'] = pd.to_numeric(df['Age'], errors='coerce')
    bins = [18, 25, 35, 45, 55, 65, 100]
    labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']
    df['AgeGroup'] = pd.cut(df['Age'], bins=bins, labels=labels, right=False)

    # Compute the median of JobSatPoints_6 and JobSatPoints_7 per age group
    # (observed=False keeps the current default behavior for categorical keys explicit)
    age_satisfaction = df.groupby('AgeGroup', observed=False)[['JobSatPoints_6', 'JobSatPoints_7']].median().dropna()

    # Draw the stacked chart
    age_satisfaction.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='viridis')
    plt.title('Stacked Chart of Job Satisfaction Scores by Age Group')
    plt.xlabel('Age Group')
    plt.ylabel('Median Job Satisfaction Score')
    plt.legend(title='Job Satisfaction Metrics')
    plt.xticks(rotation=0)
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.show()

    ### 2. Stacked chart of JobSatPoints by employment status ###

    # Compute the median per employment status
    employment_satisfaction = df.groupby('Employment')[['JobSatPoints_6', 'JobSatPoints_7']].median().dropna()

    # Draw the stacked chart
    employment_satisfaction.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='plasma')
    plt.title('Stacked Chart of Job Satisfaction Scores by Employment Status')
    plt.xlabel('Employment Status')
    plt.ylabel('Median Job Satisfaction Score')
    plt.legend(title='Job Satisfaction Metrics')
    plt.xticks(rotation=45, ha='right')
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.show()

# Close the connection
conn.close()
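
The age binning above relies on `pd.cut` with `right=False`, which makes each bin closed on the left and open on the right. The following is a minimal sketch of that behavior on made-up ages chosen to sit on the bin edges; the values are illustrative and not taken from the survey.

```python
import pandas as pd

# Hypothetical ages placed on bin edges; not taken from the survey data.
ages = pd.Series([17, 18, 24, 25, 64, 65, 99, 100])

bins = [18, 25, 35, 45, 55, 65, 100]
labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']

# right=False -> intervals are [18, 25), [25, 35), ..., [65, 100)
age_groups = pd.cut(ages, bins=bins, labels=labels, right=False)

result = pd.DataFrame({'Age': ages, 'AgeGroup': age_groups})
print(result)
```

With these bins, ages below 18 and ages of 100 or more fall outside every interval and come back as NaN, so such rows drop out of the grouped medians.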


### Task 3: Comparing Data Using Stacked Charts

1. Stacked Chart of Preferred Databases by Age Group
Visualize the top databases that respondents from different age groups wish to learn. Create a stacked chart to show the proportion of each database in each age group.

2. Stacked Chart of Employment Type by Job Satisfaction
Analyze the distribution of employment types within each job satisfaction level using a stacked chart. This will provide insights into how employment types are distributed across various satisfaction ratings.
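
Item 2 can be approached with a cross-tabulation whose rows are satisfaction levels and whose columns are employment types. The following is a minimal sketch on toy data, not a definitive implementation: the column name `JobSat` and the sample values are assumptions and may not match the real table. If `Employment` holds several `;`-separated values per row, it would first need a split-and-explode step.

```python
import pandas as pd

# Toy data standing in for the survey; 'JobSat' is an assumed column name
# for a satisfaction rating and may not exist under that name in the real table.
toy = pd.DataFrame({
    'JobSat': [8, 8, 8, 5, 5, 10],
    'Employment': [
        'Employed, full-time',
        'Employed, full-time',
        'Independent contractor, freelancer, or self-employed',
        'Employed, full-time',
        'Student, full-time',
        'Employed, full-time',
    ],
})

# Rows: satisfaction level; columns: employment type.
# normalize='index' converts each row to shares that sum to 1.
employment_by_sat = pd.crosstab(toy['JobSat'], toy['Employment'], normalize='index')
print(employment_by_sat.round(2))
```

Passing `employment_by_sat` to `.plot(kind='bar', stacked=True)` would then draw one bar per satisfaction level, split by employment type.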

In [None]:
!wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/QR9YeprUYhOoLafzlLspAw/survey-results-public.sqlite
import sqlite3
import pandas as pd
import matplotlib.pyplot as plt

# Connect to the SQLite database
conn = sqlite3.connect('survey-results-public.sqlite')

# Load the data, then close the connection
QUERY = "SELECT * FROM main"
df = pd.read_sql_query(QUERY, conn)
conn.close()

# Alternative: load from CSV (make sure df contains 'Age' and 'DatabaseWantToWorkWith')
# df = pd.read_csv('survey-data.csv')

# 1. Stacked Chart of Preferred Databases by Age Group

# Define the age groups.
# Note: errors='coerce' turns any non-numeric Age value into NaN.
df['Age'] = pd.to_numeric(df['Age'], errors='coerce')
bins = [18, 25, 35, 45, 55, 65, 100]
labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']
df['AgeGroup'] = pd.cut(df['Age'], bins=bins, labels=labels, right=False)

# Split the ';'-separated database lists into one row per database
df_exploded = df[['AgeGroup', 'DatabaseWantToWorkWith']].dropna()
df_exploded = df_exploded.assign(Database=df_exploded['DatabaseWantToWorkWith'].str.split(';')).explode('Database')

# Keep the 5 most frequently mentioned databases
top_databases = df_exploded['Database'].value_counts().nlargest(5).index
df_filtered = df_exploded[df_exploded['Database'].isin(top_databases)]

# Build a pivot table of counts (age group x database)
pivot_table = df_filtered.pivot_table(index='AgeGroup', columns='Database', aggfunc='size', fill_value=0, observed=False)

# Draw the stacked chart
pivot_table.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='tab10')
plt.title("Stacked Chart of Preferred Databases by Age Group")
plt.xlabel("Age Group")
plt.ylabel("Number of Respondents")
plt.legend(title="Database", bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
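
The task asks for the proportion of each database within each age group, while the chart above plots raw counts. One way to get proportions is to divide each row of the count table by its row total. The following is a minimal sketch on toy rows; the sample values are illustrative only.

```python
import pandas as pd

# Toy rows standing in for the survey; values are illustrative only.
toy = pd.DataFrame({
    'AgeGroup': ['18-24', '18-24', '25-34', '25-34'],
    'DatabaseWantToWorkWith': [
        'PostgreSQL;SQLite',
        'PostgreSQL',
        'MySQL;PostgreSQL',
        'MySQL',
    ],
})

# One row per (respondent, database) pair.
exploded = toy.assign(
    Database=toy['DatabaseWantToWorkWith'].str.split(';')
).explode('Database')

# Count mentions per age group and database.
counts = pd.crosstab(exploded['AgeGroup'], exploded['Database'])

# Divide each row by its total so that every stacked bar sums to 1.
proportions = counts.div(counts.sum(axis=1), axis=0)
print(proportions.round(2))
```

Plotting `proportions` with `.plot(kind='bar', stacked=True)` gives bars of equal height, which makes the mix of databases comparable between age groups of very different sizes.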


### Task 4: Exploring Technology Preferences Using Stacked Charts
1. Stacked Chart for Preferred Programming Languages by Age Group
Analyze how programming language preferences (LanguageAdmired) vary across age groups.


2. Stacked Chart for Technology Adoption by Employment Type
Explore how admired platforms (PlatformAdmired) differ across employment types (e.g., full-time, freelance).



In [4]:
import pandas as pd
import matplotlib.pyplot as plt

# Assumes df is already loaded (see the previous cell) and contains 'Age' and 'LanguageAdmired'
# df = pd.read_csv('survey-data.csv')

# Define the age groups.
# Note: errors='coerce' turns any non-numeric Age value into NaN.
df['Age'] = pd.to_numeric(df['Age'], errors='coerce')
bins = [18, 25, 35, 45, 55, 65, 100]
labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']
df['AgeGroup'] = pd.cut(df['Age'], bins=bins, labels=labels, right=False)

# Split the ';'-separated language lists into one row per language
df_exploded = df[['AgeGroup', 'LanguageAdmired']].dropna()
df_exploded = df_exploded.assign(Language=df_exploded['LanguageAdmired'].str.split(';')).explode('Language')

# Keep the 5 most frequently mentioned languages
top_languages = df_exploded['Language'].value_counts().nlargest(5).index
df_filtered = df_exploded[df_exploded['Language'].isin(top_languages)]

# Build a pivot table of counts (age group x language)
pivot_table = df_filtered.pivot_table(index='AgeGroup', columns='Language', aggfunc='size', fill_value=0, observed=False)

# Draw the stacked chart
pivot_table.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='tab10')
plt.title("Stacked Chart of Preferred Programming Languages by Age Group")
plt.xlabel("Age Group")
plt.ylabel("Number of Respondents")
plt.legend(title="Language", bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=0)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

# Assumes df is already loaded and contains 'Employment' and 'PlatformAdmired'

# Drop rows with missing values
df_filtered = df[['Employment', 'PlatformAdmired']].dropna()

# Split the ';'-separated platform lists into one row per platform
df_exploded = df_filtered.assign(Platform=df_filtered['PlatformAdmired'].str.split(';')).explode('Platform')

# Keep the 5 most frequently mentioned platforms
top_platforms = df_exploded['Platform'].value_counts().nlargest(5).index
df_filtered = df_exploded[df_exploded['Platform'].isin(top_platforms)]

# Build a pivot table of counts (employment type x platform)
pivot_table = df_filtered.pivot_table(index='Employment', columns='Platform', aggfunc='size', fill_value=0)

# Draw the stacked chart
pivot_table.plot(kind='bar', stacked=True, figsize=(10, 6), colormap='viridis')
plt.title("Stacked Chart of Admired Platforms by Employment Type")
plt.xlabel("Employment Type")
plt.ylabel("Number of Respondents")
plt.legend(title="Platform", bbox_to_anchor=(1.05, 1), loc='upper left')
plt.xticks(rotation=45, ha='right')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()


Output (truncated):

  pivot_table = df_filtered.pivot_table(index='AgeGroup', columns='Language', aggfunc='size', fill_value=0)

TypeError: no numeric data to plot
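
The `TypeError: no numeric data to plot` above is what pandas raises when the table handed to `.plot` has no numeric columns, for example because it is empty. One possible cause, stated here as an assumption and not verified against this dataset, is that `Age` holds text labels such as `'25-34 years old'` instead of numbers. In that case `pd.to_numeric(..., errors='coerce')` turns every value into NaN, `pd.cut` assigns no group, and the pivot table comes out empty. The sketch below reproduces that chain on made-up labels and shows a string-based alternative; the label format and the `age_label_to_group` helper are hypothetical.

```python
import pandas as pd

# Made-up labels; the real 'Age' values may look different.
toy = pd.DataFrame({
    'Age': ['18-24 years old', '25-34 years old', '25-34 years old', 'Prefer not to say'],
    'LanguageAdmired': ['Python;SQL', 'Python', 'Rust;Python', 'SQL'],
})

# Step 1: numeric coercion turns every text label into NaN.
coerced = pd.to_numeric(toy['Age'], errors='coerce')
print(coerced.isna().all())

# Step 2: with no numeric ages, pd.cut assigns no group at all.
bins = [18, 25, 35, 45, 55, 65, 100]
labels = ['18-24', '25-34', '35-44', '45-54', '55-64', '65+']
groups = pd.cut(coerced, bins=bins, labels=labels, right=False)
print(groups.notna().sum())

def age_label_to_group(label):
    """Hypothetical helper: keep a leading 'NN-NN' range if it is a known group."""
    first = str(label).split(' ')[0]
    # Open-ended labels (for example an 'or older' bucket) would need their own rule.
    return first if first in labels else None

# Alternative: derive the group directly from the text label.
toy['AgeGroup'] = toy['Age'].map(age_label_to_group)
exploded = toy.dropna(subset=['AgeGroup']).assign(
    Language=lambda d: d['LanguageAdmired'].str.split(';')
).explode('Language')
pivot = pd.crosstab(exploded['AgeGroup'], exploded['Language'])
print(pivot)
```

Checking `df['Age'].unique()` and `pivot_table.empty` before plotting is a quick way to tell whether this is what happened in the notebook.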

### Final Step: Review
In this lab, you focused on using stacked charts to understand the composition and comparison within the dataset. Stacked charts provided insights into job satisfaction, compensation, and preferred databases across age groups and employment types.

### Summary


After completing this lab, you will be able to:

Use stacked charts to analyze the composition of data across categories, such as job satisfaction and compensation by age group.

Compare data across different dimensions using stacked charts, enhancing your ability to communicate complex relationships in the data.

Visualize distributions across multiple categories, such as employment type by satisfaction, to gain a deeper understanding of patterns within the dataset.