## Data analysis for employment in non-profit organization

Divided by four datasets instead of two datasets and then later into another two datasets.

Import all requirement,

In [5]:
import pandas as pd
import numpy as np
import ydata_profiling as pp  
from ydata_profiling import ProfileReport 
import warnings
import os

warnings.filterwarnings('ignore')

Import unemployment dataset.

In [6]:
df = pd.read_csv('36100651.csv')

print(df.info())
print(df.head(10))

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 105840 entries, 0 to 105839
Data columns (total 17 columns):
 #   Column           Non-Null Count   Dtype  
---  ------           --------------   -----  
 0   REF_DATE         105840 non-null  int64  
 1   GEO              105840 non-null  object 
 2   DGUID            105840 non-null  object 
 3   Sector           105840 non-null  object 
 4   Characteristics  105840 non-null  object 
 5   Indicators       105840 non-null  object 
 6   UOM              105840 non-null  object 
 7   UOM_ID           105840 non-null  int64  
 8   SCALAR_FACTOR    105840 non-null  object 
 9   SCALAR_ID        105840 non-null  int64  
 10  VECTOR           105840 non-null  object 
 11  COORDINATE       105840 non-null  object 
 12  VALUE            102816 non-null  float64
 13  STATUS           3024 non-null    object 
 14  SYMBOL           0 non-null       float64
 15  TERMINATED       0 non-null       float64
 16  DECIMALS         105840 non-null  int6

Filter only the essential columns of the original dataset.

In [7]:
print("Grab the only the essential part of database.")

# From the original, 
# DGUID, UOM_ID, SCALAR_ID, VECTOR, COORDINATE, STATUS, SYMBOL, TERMINATED, and DECIMALS columns are removed.

df_sorted = df[['REF_DATE','GEO','Sector','Characteristics','Indicators','UOM','SCALAR_FACTOR','VALUE']]

print(df_sorted.head(20))
print(df_sorted.info())

print("Sort by Characteristics")
grouped = df_sorted.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.size]))

print("Sort by Indicator")
grouped = df_sorted.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.size]))


Grab the only the essential part of database.
    REF_DATE     GEO                         Sector      Characteristics  \
0       2010  Canada  Total non-profit institutions       Male employees   
1       2010  Canada  Total non-profit institutions       Male employees   
2       2010  Canada  Total non-profit institutions       Male employees   
3       2010  Canada  Total non-profit institutions       Male employees   
4       2010  Canada  Total non-profit institutions       Male employees   
5       2010  Canada  Total non-profit institutions       Male employees   
6       2010  Canada  Total non-profit institutions       Male employees   
7       2010  Canada  Total non-profit institutions     Female employees   
8       2010  Canada  Total non-profit institutions     Female employees   
9       2010  Canada  Total non-profit institutions     Female employees   
10      2010  Canada  Total non-profit institutions     Female employees   
11      2010  Canada  Total non-profit ins

Check for the missing value from the sorted dataset done above.
* Notice there is missing value in this dataset.

In [8]:
print("Original database null counter")
print(df.isnull().sum())
print("\n Modified dataset null counter.")
print(df_sorted.isnull().sum())

Original database null counter
REF_DATE                0
GEO                     0
DGUID                   0
Sector                  0
Characteristics         0
Indicators              0
UOM                     0
UOM_ID                  0
SCALAR_FACTOR           0
SCALAR_ID               0
VECTOR                  0
COORDINATE              0
VALUE                3024
STATUS             102816
SYMBOL             105840
TERMINATED         105840
DECIMALS                0
dtype: int64

 Modified dataset null counter.
REF_DATE              0
GEO                   0
Sector                0
Characteristics       0
Indicators            0
UOM                   0
SCALAR_FACTOR         0
VALUE              3024
dtype: int64


Dropping missing value from the sorted dataset.

In [9]:
df_sorted_na = df_sorted.dropna()

Check now if there's still a missing data inside modified sorted dataset done above.

In [10]:
print("Modified dataset modification after removing missing value and it's total counter")
print(df_sorted_na.isnull().sum())
# print(df_sorted_na.head(20))

print(df_sorted_na.info())
grouped = df_sorted_na.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.size]))

grouped = df_sorted_na.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.size]))

Modified dataset modification after removing missing value and it's total counter
REF_DATE           0
GEO                0
Sector             0
Characteristics    0
Indicators         0
UOM                0
SCALAR_FACTOR      0
VALUE              0
dtype: int64
<class 'pandas.core.frame.DataFrame'>
Index: 102816 entries, 0 to 105839
Data columns (total 8 columns):
 #   Column           Non-Null Count   Dtype  
---  ------           --------------   -----  
 0   REF_DATE         102816 non-null  int64  
 1   GEO              102816 non-null  object 
 2   Sector           102816 non-null  object 
 3   Characteristics  102816 non-null  object 
 4   Indicators       102816 non-null  object 
 5   UOM              102816 non-null  object 
 6   SCALAR_FACTOR    102816 non-null  object 
 7   VALUE            102816 non-null  float64
dtypes: float64(1), int64(1), object(6)
memory usage: 7.1+ MB
None
                                   size
Characteristics                        
15 to 24 years 

Panda Profiling for original dataset (CSV file),

In [11]:
# https://medium.com/analytics-vidhya/pandas-profiling-5ecd0b977ecd

pp = ProfileReport(df, title="Pandas Profiling Report")
pp_df = pp.to_html()

f = open("df_NoMod.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   5%|▍         | 1/22 [00:01<00:41,  1.97s/it, Describe variable:REF_DATE]

Summarize dataset: 100%|██████████| 31/31 [00:08<00:00,  3.73it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:07<00:00,  7.20s/it]
Render HTML: 100%|██████████| 1/1 [00:01<00:00,  1.10s/it]


Panda Profiling for sorted dataset,

In [12]:
pp_sorted = ProfileReport(df_sorted, title="Pandas Profiling Report with Columns Sorted")
pp_df_sorted = pp_sorted.to_html()

f = open("df_Sorted.html", "a") # Expert modifying data into html file.
f.write(pp_df_sorted)
f.close()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Summarize dataset: 100%|██████████| 21/21 [00:04<00:00,  4.71it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.68s/it]
Render HTML: 100%|██████████| 1/1 [00:01<00:00,  1.00s/it]


Panda Profiling for modified sorted dataset (missing data removed),

In [13]:
pp = ProfileReport(df_sorted_na, title="Pandas Profiling Report with Columned Sorted and NA Removed")
pp_df_sorted = pp.to_html()

f = open("df_Sorted-no-na.html", "a") # Expert modifying data into html file.
f.write(pp_df_sorted)
f.close()

Summarize dataset: 100%|██████████| 21/21 [00:04<00:00,  4.68it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.32s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  1.89it/s]


In [14]:
# Differences should be, there will be less data to work on.
# Particularly business non-profit organizations and community organizations haven't given more accurate data (more missing values).

Next step, I will filtered the dataset by all the 'Indicators' given below. All of them done with modified sorted dataset (filtered missing value)
* Notice there will be seven indicators data inside.
* Notice there will be divided by seven datasets based on indicators.

In [15]:
# All columns
print(df_sorted_na.info())

# All indicators
grouped = df_sorted_na.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.size]))

<class 'pandas.core.frame.DataFrame'>
Index: 102816 entries, 0 to 105839
Data columns (total 8 columns):
 #   Column           Non-Null Count   Dtype  
---  ------           --------------   -----  
 0   REF_DATE         102816 non-null  int64  
 1   GEO              102816 non-null  object 
 2   Sector           102816 non-null  object 
 3   Characteristics  102816 non-null  object 
 4   Indicators       102816 non-null  object 
 5   UOM              102816 non-null  object 
 6   SCALAR_FACTOR    102816 non-null  object 
 7   VALUE            102816 non-null  float64
dtypes: float64(1), int64(1), object(6)
memory usage: 11.1+ MB
None
                                    size
Indicators                              
Average annual hours worked        14688
Average annual wages and salaries  14688
Average hourly wage                14688
Average weekly hours worked        14688
Hours worked                       14688
Number of jobs                     14688
Wages and salaries           

Average annual hours worked from modified sorted dataset.

In [16]:
# Average annual hours worked        15120
print("\nAverage annual hours worked")
df_AvgAnnHrsWrk = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Average annual hours worked')
]
# grouped = df_AvgAnnHrsWrk.groupby(['GEO'])
grouped = df_AvgAnnHrsWrk.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print("The total number of this one is ",len(df_AvgAnnHrsWrk.index))


Average annual hours worked
                                    sum         mean         std   size
Indicators                                                             
Average annual hours worked  22787494.0  1551.436138  252.784087  14688
The total number of this one is  14688


Panda Profiling only for "Average annual hours worked"

In [17]:
pp = ProfileReport(df_AvgAnnHrsWrk, title="Average annual hours worked")
pp_df = pp.to_html()

f = open("Average annual hours worked.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 11.05it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.20s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.03it/s]


Average annual wages and salaries from modified sorted dataset. (Mention above)

In [18]:
# Average annual wages and salaries  15120
print("\nAverage annual wages and salaries")
df_AvgAnnWages = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Average annual wages and salaries')
]
grouped = df_AvgAnnWages.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages.index))


Average annual wages and salaries
                                           sum          mean           std  \
Indicators                                                                   
Average annual wages and salaries  643404649.0  43804.782748  16620.351087   

                                    size  
Indicators                                
Average annual wages and salaries  14688  
The total number of this one is  14688


Panda Profiling only for "Average annual wages and salaries"

In [19]:
pp = ProfileReport(df_AvgAnnWages, title="Average annual wages and salaries")
pp_df = pp.to_html()

f = open("Average annual wages and salaries.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 12.15it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:02<00:00,  2.97s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.01it/s]


Average hourly wage from modified sorted dataset. (Mentions above)

In [20]:
# Average hourly wage                15120
print("\nAverage hourly wage")
df_AvgHrsWages = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Average hourly wage')
]
grouped = df_AvgHrsWages.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages.index))


Average hourly wage
                           sum       mean       std   size
Indicators                                                
Average hourly wage  408702.58  27.825611  8.601721  14688
The total number of this one is  14688


Panda Profiling only for "Average hourly wages"

In [21]:
pp = ProfileReport(df_AvgHrsWages, title="Average hourly wage")
pp_df = pp.to_html()

f = open("Average hourly wages.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 11.72it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.17s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.03it/s]


Average weekly hours worked from modified sorted dataset.

In [22]:
# Average weekly hours worked        15120
print("\nAverage weekly hours worked")
df_AvgWeekHrsWrked = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Average weekly hours worked')
]
grouped = df_AvgWeekHrsWrked.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked.index))


Average weekly hours worked
                                  sum       mean      std   size
Indicators                                                      
Average weekly hours worked  438169.0  29.831767  4.86689  14688
The total number of this one is  14688


Panda Profiling only for "Average weekly hours worked"

In [23]:
pp = ProfileReport(df_AvgWeekHrsWrked, title="Average weekly hours worked")
pp_df = pp.to_html()

f = open("Average weekly hours worked.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 11.16it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.09s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  1.98it/s]


Hours worked from modified sorted dataset.
* Notice, Skewed left.

In [24]:
# Hours worked                       15120
print("\nHours worked")
df_Hrs_Wrked = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Hours worked')
]
grouped = df_Hrs_Wrked.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print(grouped['VALUE'].agg([np.amin, np.amax]))
print("The total number of this one is ",len(df_Hrs_Wrked.index))


Hours worked
                       sum          mean            std   size
Indicators                                                    
Hours worked  1.227872e+09  83596.946283  253684.449101  14688
              amin       amax
Indicators                   
Hours worked   6.0  3857813.0
The total number of this one is  14688


Panda Profiling only for "Hours worked" (Skewed left, noticed)

In [25]:
pp = ProfileReport(df_Hrs_Wrked, title="Hours Worked")
pp_df = pp.to_html()

f = open("Hours worked.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   8%|▊         | 1/13 [00:00<00:01,  9.42it/s, Describe variable:VALUE]   

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 10.59it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.15s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.15it/s]


Number of jobs from modified sorted dataset.
* Notice, skewed left.

In [26]:
# Number of jobs                     15120
print("\nNumber of jobs")
df_NumOfJob = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Number of jobs')
]
grouped = df_NumOfJob.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print(grouped['VALUE'].agg([np.amin, np.amax]))
print("The total number of this one is ",len(df_NumOfJob.index))


Number of jobs
                        sum          mean            std   size
Indicators                                                     
Number of jobs  784942325.0  53441.062432  161120.806948  14688
                amin       amax
Indicators                     
Number of jobs  11.0  2428289.0
The total number of this one is  14688


Panda Profiling only for "Number of the jobs" (Stewed toward left)

In [27]:
pp = ProfileReport(df_NumOfJob, title="Number of jobs")
pp_df = pp.to_html()

f = open("Number of jobs.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 11.04it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:02<00:00,  2.89s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.21it/s]


Wages and salaries from modified sorted dataset.

* Noticed skewed left.

In [28]:
# Wages and salaries                 15120
print("\nWages and salaries")
df_WagesAndSalaries = df_sorted_na.loc[
    (df_sorted_na['Indicators'] == 'Wages and salaries')
]
grouped = df_WagesAndSalaries.groupby(['Indicators'])
print(grouped['VALUE'].agg([np.sum, np.mean, np.std, np.size]))
print(grouped['VALUE'].agg([np.amin, np.amax]))
print("The total number of this one is ",len(df_WagesAndSalaries.index))


Wages and salaries
                           sum         mean          std   size
Indicators                                                     
Wages and salaries  36498479.0  2484.918233  7977.441388  14688
                    amin      amax
Indicators                        
Wages and salaries   0.0  132601.0
The total number of this one is  14688


Panda Profiling only for "Wages and salaries" (Strewed toward left)

In [29]:
pp = ProfileReport(df_WagesAndSalaries, title="Wages and Salaries")
pp_df = pp.to_html()

f = open("Wages and salaries.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Summarize dataset: 100%|██████████| 21/21 [00:01<00:00, 12.17it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.25s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.03it/s]


For next step, I will divide each Indicators dataset into three different datasets.<br />
They are 2013-2015, 2016-2018, 2019-2021.<br />
I have dataset prepared before 2010-2012. However, it will not be used after this section, there's too much to analysis to do.<br />
It will also demonstrate here why.
Originally, I used divide dataset into 2016-2017, 2018-2019, and 2020-2021 from dataset that was divided from 2016 and up.

In [30]:
print("There are seven Indicators to analysis,")
grouped = df_sorted_na.groupby('Indicators')
print(grouped['VALUE'].agg([np.size]))

print("\nThe data inside between 2010-2013, there's are # number of data and I will be repeating this seven more time.,")
df_Avg_Sample = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2010) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2011) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2012)
]

grouped = df_Avg_Sample.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.size]))

print("\nTo data inside above 2013 and split into three datasets, I need to repeat this analysis for "+str(7*3)+" (7x3) times.")
print("\nThis is also total of spliting into "+str(7*3)+" datasets.")

df_Avg_Sample_2013 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2013) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2014) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2015)
]

df_Avg_Sample_2016 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2016) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2017) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2018)
]

df_Avg_Sample_2019 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2019) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2020) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2021)
]

grouped = df_Avg_Sample_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.size]))

grouped = df_Avg_Sample_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.size]))

grouped = df_Avg_Sample_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.size]))

There are seven Indicators to analysis,
                                    size
Indicators                              
Average annual hours worked        14688
Average annual wages and salaries  14688
Average hourly wage                14688
Average weekly hours worked        14688
Hours worked                       14688
Number of jobs                     14688
Wages and salaries                 14688

The data inside between 2010-2013, there's are # number of data and I will be repeating this seven more time.,
          size
REF_DATE      
2010      1224
2011      1224
2012      1224

To data inside above 2013 and split into three datasets, I need to repeat this analysis for 21 (7x3) times.

This is also total of spliting into 21 datasets.
          size
REF_DATE      
2013      1224
2014      1224
2015      1224
          size
REF_DATE      
2016      1224
2017      1224
2018      1224
          size
REF_DATE      
2019      1224
2020      1224
2021      1224


Grabbing the year (REF_DATE) from 2010, 2013, 2016, 2018, and 2019 individually for "Average annual hours worked".

In [31]:
# 2010-2012
df_AvgAnnHrsWrk_2010 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2010) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2011) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2012)
]

grouped = df_AvgAnnHrsWrk_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

                sum  size
REF_DATE                 
2010      1905000.0  1224
2011      1909823.0  1224
2012      1915650.0  1224


In [32]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_AvgAnnHrsWrk_2013 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2013) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2014) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2015)
]

# 2016 - 2018
df_AvgAnnHrsWrk_2016 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2016) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2017) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2018)
]

# 20109- 2021
df_AvgAnnHrsWrk_2019 = df_AvgAnnHrsWrk.loc[
    (df_AvgAnnHrsWrk['REF_DATE'] == 2019) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2020) |
    (df_AvgAnnHrsWrk['REF_DATE'] == 2021)
]

grouped = df_AvgAnnHrsWrk_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgAnnHrsWrk_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgAnnHrsWrk_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
                sum  size
REF_DATE                 
2013      1906851.0  1224
2014      1898034.0  1224
2015      1907286.0  1224
                sum  size
REF_DATE                 
2016      1899738.0  1224
2017      1881389.0  1224
2018      1894227.0  1224
                sum  size
REF_DATE                 
2019      1894126.0  1224
2020      1873544.0  1224
2021      1901826.0  1224


Panda Profiling for year 2016, 2018, and 2020 for "Average annual hours worked".

In [33]:
# 2016-2017
pp = ProfileReport(df_AvgAnnHrsWrk_2013, title="Average annual hours worked 2013")
pp_df = pp.to_html()

f = open("Average annual hours worked 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

# 2017 - 2019
pp = ProfileReport(df_AvgAnnHrsWrk_2016, title="Average annual hours worked 2016")
pp_df = pp.to_html()

f = open("Average annual hours worked 2018.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

# 2020 - 2021
pp = ProfileReport(df_AvgAnnHrsWrk_2019, title="Average annual hours worked 2019")
pp_df = pp.to_html()

f = open("Average annual hours worked 2020.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:  23%|██▎       | 3/13 [00:00<00:00, 11.96it/s, Describe variable:UOM]   

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.43it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.02s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.30it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.61it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.10s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.59it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.17it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.28s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.36it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "Average annual wages and salaries".

In [34]:
# 2010 - 2012
df_AvgAnnWages_2010 = df_AvgAnnWages.loc[
    (df_AvgAnnWages['REF_DATE'] == 2010) |
    (df_AvgAnnWages['REF_DATE'] == 2011) |
    (df_AvgAnnWages['REF_DATE'] == 2012)
]

grouped = df_AvgAnnWages_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

                 sum  size
REF_DATE                  
2010      48093832.0  1224
2011      49165720.0  1224
2012      50085622.0  1224


In [35]:
print("Grabbing the data from 2017, 2019, and 2021.")

# 2013 - 2015
df_AvgAnnWages_2013 = df_AvgAnnWages.loc[
    (df_AvgAnnWages['REF_DATE'] == 2013) |
    (df_AvgAnnWages['REF_DATE'] == 2014) |
    (df_AvgAnnWages['REF_DATE'] == 2015)
]

# 2016 - 2018
df_AvgAnnWages_2016 = df_AvgAnnWages.loc[
    (df_AvgAnnWages['REF_DATE'] == 2016) |
    (df_AvgAnnWages['REF_DATE'] == 2017) |
    (df_AvgAnnWages['REF_DATE'] == 2018)
]

# 2019 - 2021
df_AvgAnnWages_2019 = df_AvgAnnWages.loc[
    (df_AvgAnnWages['REF_DATE'] == 2019) |
    (df_AvgAnnWages['REF_DATE'] == 2020) |
    (df_AvgAnnWages['REF_DATE'] == 2021)
]

grouped = df_AvgAnnWages_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgAnnWages_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgAnnWages_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2017, 2019, and 2021.
                 sum  size
REF_DATE                  
2013      50598135.0  1224
2014      51805889.0  1224
2015      52715143.0  1224
                 sum  size
REF_DATE                  
2016      53166285.0  1224
2017      53965359.0  1224
2018      55525920.0  1224
                 sum  size
REF_DATE                  
2019      56997121.0  1224
2020      60597775.0  1224
2021      60687848.0  1224


Panda Profiling for year 2013, 2016, and 2019 for "Average annual wages and salaries".

In [36]:
pp = ProfileReport(df_AvgAnnWages_2013, title="Average annual wages and salaries 2013")
pp_df = pp.to_html()

f = open("Average annual wages and salaries 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgAnnWages_2016, title="Average annual wages and salaries 2016")
pp_df = pp.to_html()

f = open("Average annual wages and salaries 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgAnnWages_2019, title="Average annual wages and salaries 2019")
pp_df = pp.to_html()

f = open("Average annual wages and salaries 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:  23%|██▎       | 3/13 [00:00<00:00, 10.31it/s, Describe variable:Characteristics]

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.75it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.03s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.50it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 13.16it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:02<00:00,  2.90s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.37it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.52it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.07s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.44it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "Average hourly wages".

In [37]:
# 2010 - 2012
df_AvgHrsWages_2010 = df_AvgHrsWages.loc[
    (df_AvgHrsWages['REF_DATE'] == 2010) |
    (df_AvgHrsWages['REF_DATE'] == 2011) |
    (df_AvgHrsWages['REF_DATE'] == 2012)
]

grouped = df_AvgHrsWages_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

               sum  size
REF_DATE                
2010      30453.20  1224
2011      31053.73  1224
2012      31511.37  1224


In [38]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_AvgHrsWages_2013 = df_AvgHrsWages.loc[
    (df_AvgHrsWages['REF_DATE'] == 2013) |
    (df_AvgHrsWages['REF_DATE'] == 2014) |
    (df_AvgHrsWages['REF_DATE'] == 2015)
]

# 2016 - 2018
df_AvgHrsWages_2016 = df_AvgHrsWages.loc[
    (df_AvgHrsWages['REF_DATE'] == 2016) |
    (df_AvgHrsWages['REF_DATE'] == 2017) |
    (df_AvgHrsWages['REF_DATE'] == 2018)
]

# 2019 - 2021
df_AvgHrsWages_2019 = df_AvgHrsWages.loc[
    (df_AvgHrsWages['REF_DATE'] == 2019) |
    (df_AvgHrsWages['REF_DATE'] == 2020) |
    (df_AvgHrsWages['REF_DATE'] == 2021)
]

grouped = df_AvgHrsWages_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgHrsWages_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgHrsWages_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
               sum  size
REF_DATE                
2013      31932.19  1224
2014      32916.44  1224
2015      33362.79  1224
               sum  size
REF_DATE                
2016      33756.84  1224
2017      34571.41  1224
2018      35356.16  1224
               sum  size
REF_DATE                
2019      36293.34  1224
2020      38980.38  1224
2021      38514.73  1224


Panda Profiling for year 2013, 2016, and 2019 for "Average hourly wages".

In [39]:
pp = ProfileReport(df_AvgHrsWages_2013, title="Average hourly wage 2013")
pp_df = pp.to_html()

f = open("Average hourly wages 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgHrsWages_2016, title="Average hourly wage 2016")
pp_df = pp.to_html()

f = open("Average hourly wages 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgHrsWages_2019, title="Average hourly wage 2019")
pp_df = pp.to_html()

f = open("Average hourly wages 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.87it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.24s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.31it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.09it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.12s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.41it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 17.37it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.26s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.49it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "Average weekly hours worked".

In [40]:
# 2010 - 2012
df_AvgWeekHrsWrked_2010 = df_AvgWeekHrsWrked.loc[
    (df_AvgWeekHrsWrked['REF_DATE'] == 2010) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2011) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2012)
]

grouped = df_AvgWeekHrsWrked_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

              sum  size
REF_DATE               
2010      36645.0  1224
2011      36732.0  1224
2012      36842.0  1224


In [41]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_AvgWeekHrsWrked_2013 = df_AvgWeekHrsWrked.loc[
    (df_AvgWeekHrsWrked['REF_DATE'] == 2013) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2014) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2015)
]

# 2016 - 2018
df_AvgWeekHrsWrked_2016 = df_AvgWeekHrsWrked.loc[
    (df_AvgWeekHrsWrked['REF_DATE'] == 2016) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2017) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2018)
]

# 2019 - 2021
df_AvgWeekHrsWrked_2019 = df_AvgWeekHrsWrked.loc[
    (df_AvgWeekHrsWrked['REF_DATE'] == 2019) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2020) |
    (df_AvgWeekHrsWrked['REF_DATE'] == 2021)
]

grouped = df_AvgWeekHrsWrked_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgWeekHrsWrked_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_AvgWeekHrsWrked_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
              sum  size
REF_DATE               
2013      36671.0  1224
2014      36483.0  1224
2015      36678.0  1224
              sum  size
REF_DATE               
2016      36541.0  1224
2017      36170.0  1224
2018      36416.0  1224
              sum  size
REF_DATE               
2019      36400.0  1224
2020      36036.0  1224
2021      36555.0  1224


Panda Profiling for year 2013, 2016, and 2019 for "Average weekly hours worked".

In [42]:
pp = ProfileReport(df_AvgWeekHrsWrked_2013, title="Average weekly hours worked 2013")
pp_df = pp.to_html()

f = open("Average weekly hours worked 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgWeekHrsWrked_2016, title="Average weekly hours worked 2016")
pp_df = pp.to_html()

f = open("Average weekly hours worked 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_AvgWeekHrsWrked_2019, title="Average weekly hours worked 2019")
pp_df = pp.to_html()

f = open("Average weekly hours worked 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.52it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:02<00:00,  2.93s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.41it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 15.84it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.09s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  1.53it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.00it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.22s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.23it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "hours worked".

In [43]:
# 2010 - 2012
df_Hrs_Wrked_2010 = df_Hrs_Wrked.loc[
    (df_Hrs_Wrked['REF_DATE'] == 2010) |
    (df_Hrs_Wrked['REF_DATE'] == 2011) |
    (df_Hrs_Wrked['REF_DATE'] == 2012)
]

grouped = df_Hrs_Wrked_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

                 sum  size
REF_DATE                  
2010      94217881.0  1224
2011      95953248.0  1224
2012      97724232.0  1224


In [44]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_Hrs_Wrked_2013 = df_Hrs_Wrked.loc[
    (df_Hrs_Wrked['REF_DATE'] == 2013) |
    (df_Hrs_Wrked['REF_DATE'] == 2014) |
    (df_Hrs_Wrked['REF_DATE'] == 2015)
]

# 2016 - 2018
df_Hrs_Wrked_2016 = df_Hrs_Wrked.loc[
    (df_Hrs_Wrked['REF_DATE'] == 2016) |
    (df_Hrs_Wrked['REF_DATE'] == 2017) |
    (df_Hrs_Wrked['REF_DATE'] == 2018)
]

# 2019 - 2021
df_Hrs_Wrked_2019 = df_Hrs_Wrked.loc[
    (df_Hrs_Wrked['REF_DATE'] == 2019) |
    (df_Hrs_Wrked['REF_DATE'] == 2020) |
    (df_Hrs_Wrked['REF_DATE'] == 2021)
]

grouped = df_Hrs_Wrked_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_Hrs_Wrked_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_Hrs_Wrked_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
                  sum  size
REF_DATE                   
2013       98935086.0  1224
2014       99777902.0  1224
2015      101927894.0  1224
                  sum  size
REF_DATE                   
2016      103980992.0  1224
2017      103906357.0  1224
2018      106885614.0  1224
                  sum  size
REF_DATE                   
2019      108384508.0  1224
2020      103897606.0  1224
2021      112280627.0  1224


Panda Profiling for year 2013, 2016, and 2019 for "hours worked".

In [45]:
pp = ProfileReport(df_Hrs_Wrked_2013, title="Hours Worked 2013")
pp_df = pp.to_html()

f = open("Hours worked 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_Hrs_Wrked_2016, title="Hours Worked 2016")
pp_df = pp.to_html()

f = open("Hours worked 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_Hrs_Wrked_2019, title="Hours Worked 2019")
pp_df = pp.to_html()

f = open("Hours worked 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:  23%|██▎       | 3/13 [00:00<00:00, 12.90it/s, Describe variable:UOM]  

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 13.73it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.15s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.29it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.34it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.29s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.26it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 15.06it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.21s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.30it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "Number of jobs".

In [46]:
# 2010 - 2012
df_NumOfJob_2010 = df_NumOfJob.loc[
    (df_NumOfJob['REF_DATE'] == 2010) |
    (df_NumOfJob['REF_DATE'] == 2011) |
    (df_NumOfJob['REF_DATE'] == 2012)
]

grouped = df_NumOfJob_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

                 sum  size
REF_DATE                  
2010      59793067.0  1224
2011      61122016.0  1224
2012      61949430.0  1224


In [47]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_NumOfJob_2013 = df_NumOfJob.loc[
    (df_NumOfJob['REF_DATE'] == 2013) |
    (df_NumOfJob['REF_DATE'] == 2014) |
    (df_NumOfJob['REF_DATE'] == 2015)
]

# 2016 - 2018
df_NumOfJob_2016 = df_NumOfJob.loc[
    (df_NumOfJob['REF_DATE'] == 2016) |
    (df_NumOfJob['REF_DATE'] == 2017) |
    (df_NumOfJob['REF_DATE'] == 2018)
]

# 2019 - 2021
df_NumOfJob_2019 = df_NumOfJob.loc[
    (df_NumOfJob['REF_DATE'] == 2019) |
    (df_NumOfJob['REF_DATE'] == 2020) |
    (df_NumOfJob['REF_DATE'] == 2021)
]

grouped = df_NumOfJob_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_NumOfJob_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_NumOfJob_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
                 sum  size
REF_DATE                  
2013      63161481.0  1224
2014      63980381.0  1224
2015      65239031.0  1224
                 sum  size
REF_DATE                  
2016      66578386.0  1224
2017      66979678.0  1224
2018      68215341.0  1224
                 sum  size
REF_DATE                  
2019      69702012.0  1224
2020      67249987.0  1224
2021      70971515.0  1224


Panda Profiling for year 2013, 2016, and 2019for "Number of jobs".

In [48]:
pp = ProfileReport(df_NumOfJob_2013, title="Number of jobs 2013")
pp_df = pp.to_html()

f = open("Number of jobs 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_NumOfJob_2016, title="Number of jobs 2016")
pp_df = pp.to_html()

f = open("Number of jobs 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_NumOfJob_2019, title="Number of jobs 2019")
pp_df = pp.to_html()

f = open("Number of jobs 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   8%|▊         | 1/13 [00:00<00:02,  5.62it/s, Describe variable:Sector]  

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.47it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.31s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.45it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.02it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.05s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.30it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.09it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.38s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.50it/s]


Grabbing the year (REF_DATE) from 2010, 2013, 2016, and 2019 individually for "Wages and Salaries".

In [49]:
# 2010 - 2012
df_WagesAndSalaries_2010 = df_WagesAndSalaries.loc[
    (df_WagesAndSalaries['REF_DATE'] == 2010) |
    (df_WagesAndSalaries['REF_DATE'] == 2011) |
    (df_WagesAndSalaries['REF_DATE'] == 2012)
]

grouped = df_WagesAndSalaries_2010.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

                sum  size
REF_DATE                 
2010      2489681.0  1224
2011      2604921.0  1224
2012      2687510.0  1224


In [50]:
print("Grabbing the data from 2013, 2016, and 2019.")

# 2013 - 2015
df_WagesAndSalaries_2013 = df_WagesAndSalaries.loc[
    (df_WagesAndSalaries['REF_DATE'] == 2013) |
    (df_WagesAndSalaries['REF_DATE'] == 2014) |
    (df_WagesAndSalaries['REF_DATE'] == 2015)
]

# 2016 - 2018
df_WagesAndSalaries_2016 = df_WagesAndSalaries.loc[
    (df_WagesAndSalaries['REF_DATE'] == 2016) |
    (df_WagesAndSalaries['REF_DATE'] == 2017) |
    (df_WagesAndSalaries['REF_DATE'] == 2018)
]

# 2019 - 2021
df_WagesAndSalaries_2019 = df_WagesAndSalaries.loc[
    (df_WagesAndSalaries['REF_DATE'] == 2019) |
    (df_WagesAndSalaries['REF_DATE'] == 2020) |
    (df_WagesAndSalaries['REF_DATE'] == 2021)
]

grouped = df_WagesAndSalaries_2013.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_WagesAndSalaries_2016.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

grouped = df_WagesAndSalaries_2019.groupby(['REF_DATE'])
print(grouped['VALUE'].agg([np.sum, np.size]))

Grabbing the data from 2013, 2016, and 2019.
                sum  size
REF_DATE                 
2013      2746199.0  1224
2014      2844225.0  1224
2015      2965770.0  1224
                sum  size
REF_DATE                 
2016      3043217.0  1224
2017      3118809.0  1224
2018      3263527.0  1224
                sum  size
REF_DATE                 
2019      3437480.0  1224
2020      3532888.0  1224
2021      3764252.0  1224


Panda Profiling for year 2013, 2016, and 2019 for "Wages and Salaries".

In [51]:
pp = ProfileReport(df_WagesAndSalaries_2013, title="Wages and Salaries 2013")
pp_df = pp.to_html()

f = open("Wages and Salaries 2013.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_WagesAndSalaries_2016, title="Wages and Salaries 2016")
pp_df = pp.to_html()

f = open("Wages and Salaries 2016.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

pp = ProfileReport(df_WagesAndSalaries_2019, title="Wages and Salaries 2019")
pp_df = pp.to_html()

f = open("Wages and Salaries 2019.html", "a")  # Expert into html file without modifying any columns in dataset.
f.write(pp_df)
f.close()

Summarize dataset:   8%|▊         | 1/13 [00:00<00:00, 13.02it/s, Describe variable:VALUE]

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 14.22it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.14s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.46it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.79it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.27s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.37it/s]
Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 17.76it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.18s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.45it/s]


For next step, I will filtered it by the following group, "age group", "gender level", "education level", "immigrant level" and "Aboriginal status".
* I have comment, analysis of whole dataset, 2010-2013 (originally, before 2016).
* I have analysis two training and testing set. (2013-2015), (2016-2018), (2019-2021)
* There's also other characteristics there as well but I decided to drop them as well.

Filtered for "Average annual hours worked" by following: "Age group", "Gender level", "Education level", and "Immigration status".<br />
"Aboriginal status" has been commented.

In [52]:
# Dataset year in 2010 inside Average Annual Hours Worked

# print("\nAge group in Alberta")
# df_AvgAnnHrsWrk_2010_ByAge = df_AvgAnnHrsWrk_2010.loc[
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '15 to 24 years') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '25 to 34 years') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '35 to 44 years') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '45 to 54 years') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '55 to 64 years') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == '65 years old and over')]
# # print(df_AvgAnnHrsWrk_2010_ByAge.head(20))
# grouped = df_AvgAnnHrsWrk_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("Total size : ",len(df_AvgAnnHrsWrk_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_AvgAnnHrsWrk_2010_ByGender = df_AvgAnnHrsWrk_2010.loc[
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Female employees') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_AvgAnnHrsWrk_2010_ByGender.head(20))
# grouped = df_AvgAnnHrsWrk_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("Total size : ",len(df_AvgAnnHrsWrk_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_AvgAnnHrsWrk_2010_ByEducation = df_AvgAnnHrsWrk_2010.loc[
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'High school diploma and less') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Trade certificate') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_AvgAnnHrsWrk_2010_ByEducation.head(20))
# grouped = df_AvgAnnHrsWrk_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("Total size : ",len(df_AvgAnnHrsWrk_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_AvgAnnHrsWrk_2010_ByImmigrant = df_AvgAnnHrsWrk_2010.loc[
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Immigrant employees') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_AvgAnnHrsWrk_2010_ByImmigrant.head(20))
# grouped = df_AvgAnnHrsWrk_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("Total size : ",len(df_AvgAnnHrsWrk_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnHrkWrk_2010_ByIndigenous = df_AvgAnnHrsWrk_2010.loc[
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnHrsWrk_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# print(df_AvgAnnHrkWrk_2010_ByIndigenous.head(20))
# # grouped = df_AvgAnnHrkWrk_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnHrkWrk_2010_ByIndigenous.index))

In [53]:
# Dataset year in 2013 inside Average Annual Hours Worked

print("\nAge group in Alberta")
df_AvgAnnHrsWrk_2013_ByAge = df_AvgAnnHrsWrk_2013.loc[
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnHrsWrk_2013_ByAge.head(20))
grouped = df_AvgAnnHrsWrk_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2013_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnHrsWrk_2013_ByGender = df_AvgAnnHrsWrk_2013.loc[
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Female employees') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnHrsWrk_2013_ByGender.head(20))
grouped = df_AvgAnnHrsWrk_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2013_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnHrsWrk_2013_ByEducation = df_AvgAnnHrsWrk_2013.loc[
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnHrsWrk_2013_ByEducation.head(20))
grouped = df_AvgAnnHrsWrk_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnHrsWrk_2013_ByImmigrant = df_AvgAnnHrsWrk_2013.loc[
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnHrsWrk_2013_ByImmigrant.head(20))
grouped = df_AvgAnnHrsWrk_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnHrsWrk_2013_ByIndigenous = df_AvgAnnHrsWrk_2013.loc[
#     (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnHrsWrk_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# print(df_AvgAnnHrsWrk_2013_ByIndigenous.head(20))
# # grouped = df_AvgAnnHrsWrk_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnHrsWrk_2013_ByIndigenous.index))


Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years         172087.0   192
25 to 34 years         340647.0   210
35 to 44 years         377140.0   210
45 to 54 years         387957.0   210
55 to 64 years         352301.0   210
65 years old and over  207070.0   192
Total size :  1224

Gender group in Alberta
                       sum  size
Characteristics                 
Female employees  322203.0   210
Male employees    351902.0   210
Total size :  420

Education group in Alberta
                                   sum  size
Characteristics                             
High school diploma and less  277682.0   210
Trade certificate             313663.0   198
University degree and higher  347222.0   198
Total size :  606

Immigrant group in Alberta
                              sum  size
Characteristics                        
Immigrant employees      326890.0   198
Non-immigrant employees  312119.0   198
Total size :  396


In [54]:
# Dataset year in 2016 inside Average Annual Hours Worked

print("\nAge group in Alberta")
df_AvgAnnHrsWrk_2016_ByAge = df_AvgAnnHrsWrk_2016.loc[
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnHrsWrk_2016_ByAge.head(20))
grouped = df_AvgAnnHrsWrk_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2016_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnHrsWrk_2016_ByGender = df_AvgAnnHrsWrk_2016.loc[
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Female employees') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnHrsWrk_2016_ByGender.head(20))
grouped = df_AvgAnnHrsWrk_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2016_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnHrsWrk_2016_ByEducation = df_AvgAnnHrsWrk_2016.loc[
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnHrsWrk_2016_ByEducation.head(20))
grouped = df_AvgAnnHrsWrk_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnHrsWrk_2016_ByImmigrant = df_AvgAnnHrsWrk_2016.loc[
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnHrsWrk_2016_ByImmigrant.head(20))
grouped = df_AvgAnnHrsWrk_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnHrsWrk_2016_ByIndigenous = df_AvgAnnHrsWrk_2016.loc[
#     (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnHrsWrk_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# print(df_AvgAnnHrsWrk_2016_ByIndigenous.head(20))
# # grouped = df_AvgAnnHrsWrk_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnHrsWrk_2016_ByIndigenous.index))


Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years         176153.0   192
25 to 34 years         337302.0   210
35 to 44 years         373702.0   210
45 to 54 years         385915.0   210
55 to 64 years         350828.0   210
65 years old and over  206428.0   192
Total size :  1224

Gender group in Alberta
                       sum  size
Characteristics                 
Female employees  321687.0   210
Male employees    346749.0   210
Total size :  420

Education group in Alberta
                                   sum  size
Characteristics                             
High school diploma and less  276370.0   210
Trade certificate             307989.0   198
University degree and higher  343664.0   198
Total size :  606

Immigrant group in Alberta
                              sum  size
Characteristics                        
Immigrant employees      321592.0   198
Non-immigrant employees  310173.0   198
Total size :  396


In [55]:
# Dataset year in 2019 inside Average Annual Hours Worked

print("\nAge group in Alberta")
df_AvgAnnHrsWrk_2019_ByAge = df_AvgAnnHrsWrk_2019.loc[
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnHrsWrk_2019_ByAge.head(20))
grouped = df_AvgAnnHrsWrk_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2019_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnHrsWrk_2019_ByGender = df_AvgAnnHrsWrk_2019.loc[
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Female employees') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnHrsWrk_2019_ByGender.head(20))
grouped = df_AvgAnnHrsWrk_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2019_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnHrsWrk_2019_ByEducation = df_AvgAnnHrsWrk_2019.loc[
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnHrsWrk_2019_ByEducation.head(20))
grouped = df_AvgAnnHrsWrk_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnHrsWrk_2019_ByImmigrant = df_AvgAnnHrsWrk_2019.loc[
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnHrsWrk_2019_ByImmigrant.head(20))
grouped = df_AvgAnnHrsWrk_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("Total size : ",len(df_AvgAnnHrsWrk_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnHrsWrk_2019_ByIndigenous = df_AvgAnnHrsWrk_2019.loc[
#     (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnHrsWrk_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# print(df_AvgAnnHrsWrk_2019_ByIndigenous.head(20))
# # grouped = df_AvgAnnHrkWrk_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnHrsWrk_2019_ByIndigenous.index))


Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years         179872.0   192
25 to 34 years         333662.0   210
35 to 44 years         371386.0   210
45 to 54 years         384987.0   210
55 to 64 years         351843.0   210
65 years old and over  206760.0   192
Total size :  1224

Gender group in Alberta
                       sum  size
Characteristics                 
Female employees  323755.0   210
Male employees    343101.0   210
Total size :  420

Education group in Alberta
                                   sum  size
Characteristics                             
High school diploma and less  275636.0   210
Trade certificate             306174.0   198
University degree and higher  341795.0   198
Total size :  606

Immigrant group in Alberta
                              sum  size
Characteristics                        
Immigrant employees      318818.0   198
Non-immigrant employees  310245.0   198
Total size :  396


Filtered for "Average annual wages and salaries" by following: "Age group", "Gender level", "Education level", and "Immigration status".<br />
"Aboriginal status" has been commented.

In [56]:
# # Dataset year in 2010 inside Average annual wages and salaries

# print("\nAge group in Alberta")
# df_AvgAnnWages_2010_ByAge = df_AvgAnnWages_2010.loc[
#     (df_AvgAnnWages_2010['Characteristics'] == '15 to 24 years') |
#     (df_AvgAnnWages_2010['Characteristics'] == '25 to 34 years') |
#     (df_AvgAnnWages_2010['Characteristics'] == '35 to 44 years') |
#     (df_AvgAnnWages_2010['Characteristics'] == '45 to 54 years') |
#     (df_AvgAnnWages_2010['Characteristics'] == '55 to 64 years') |
#     (df_AvgAnnWages_2010['Characteristics'] == '65 years old and over')]
# # print(df_AvgAnnWages_2010_ByAge.head(20))
# grouped = df_AvgAnnWages_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_AvgAnnWages_2010_ByGender = df_AvgAnnWages_2010.loc[
#     (df_AvgAnnWages_2010['Characteristics'] == 'Female employees') |
#     (df_AvgAnnWages_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_AvgAnnWages_2010_ByGender.head(20))
# grouped = df_AvgAnnWages_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_AvgAnnWages_2010_ByEducation = df_AvgAnnWages_2010.loc[
#     (df_AvgAnnWages_2010['Characteristics'] == 'High school diploma and less') |
#     (df_AvgAnnWages_2010['Characteristics'] == 'Trade certificate') |
#     (df_AvgAnnWages_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_AvgAnnWages_2010_ByEducation.head(20))
# grouped = df_AvgAnnWages_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_AvgAnnWages_2010_ByImmigrant = df_AvgAnnWages_2010.loc[
#     (df_AvgAnnWages_2010['Characteristics'] == 'Immigrant employees') |
#     (df_AvgAnnWages_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_AvgAnnWages_2010_ByImmigrant.head(20))
# grouped = df_AvgAnnWages_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnWages_2010_ByIndigenous = df_AvgAnnWages_2010.loc[
#     (df_AvgAnnWages_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnWages_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgAnnWages_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2010_ByIndigenous.index))

In [57]:
# Dataset year in 2013 inside Average annual wages and salaries

print("\nAge group in Alberta")
df_AvgAnnWages_2013_ByAge = df_AvgAnnWages_2013.loc[
    (df_AvgAnnWages_2013['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnWages_2013['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnWages_2013['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnWages_2013['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnWages_2013['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnWages_2013['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnWages_2013_ByAge.head(20))
grouped = df_AvgAnnWages_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2013_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnWages_2013_ByGender = df_AvgAnnWages_2013.loc[
    (df_AvgAnnWages_2013['Characteristics'] == 'Female employees') |
    (df_AvgAnnWages_2013['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnWages_2013_ByGender.head(20))
grouped = df_AvgAnnWages_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2013_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnWages_2013_ByEducation = df_AvgAnnWages_2013.loc[
    (df_AvgAnnWages_2013['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnWages_2013['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnWages_2013['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnWages_2013_ByEducation.head(20))
grouped = df_AvgAnnWages_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnWages_2013_ByImmigrant = df_AvgAnnWages_2013.loc[
    (df_AvgAnnWages_2013['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnWages_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnWages_2013_ByImmigrant.head(20))
grouped = df_AvgAnnWages_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnWages_2013_ByIndigenous = df_AvgAnnWages_2013.loc[
#     (df_AvgAnnWages_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnWages_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgAnnWages_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2013_ByIndigenous.index))


Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          2784616.0   192
25 to 34 years          7953813.0   210
35 to 44 years         10794086.0   210
45 to 54 years         11596540.0   210
55 to 64 years         11087673.0   210
65 years old and over   5949938.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees   8352873.0   210
Male employees    10560729.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   5903675.0   210
Trade certificate              7603798.0   198
University degree and higher  11213834.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigran

In [58]:
# Dataset year in 2016 inside Average annual wages and salaries

print("\nAge group in Alberta")
df_AvgAnnWages_2016_ByAge = df_AvgAnnWages_2016.loc[
    (df_AvgAnnWages_2016['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnWages_2016['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnWages_2016['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnWages_2016['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnWages_2016['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnWages_2016['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnWages_2016_ByAge.head(20))
grouped = df_AvgAnnWages_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2016_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnWages_2016_ByGender = df_AvgAnnWages_2016.loc[
    (df_AvgAnnWages_2016['Characteristics'] == 'Female employees') |
    (df_AvgAnnWages_2016['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnWages_2016_ByGender.head(20))
grouped = df_AvgAnnWages_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2016_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnWages_2016_ByEducation = df_AvgAnnWages_2016.loc[
    (df_AvgAnnWages_2016['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnWages_2016['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnWages_2016['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnWages_2016_ByEducation.head(20))
grouped = df_AvgAnnWages_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnWages_2016_ByImmigrant = df_AvgAnnWages_2016.loc[
    (df_AvgAnnWages_2016['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnWages_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnWages_2016_ByImmigrant.head(20))
grouped = df_AvgAnnWages_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnWages_2016_ByIndigenous = df_AvgAnnWages_2016.loc[
#     (df_AvgAnnWages_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnWages_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgAnnWages_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2016_ByIndigenous.index))


Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          3060129.0   192
25 to 34 years          8325679.0   210
35 to 44 years         11269609.0   210
45 to 54 years         12276948.0   210
55 to 64 years         11610535.0   210
65 years old and over   6249361.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees   8890302.0   210
Male employees    10887008.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   6237814.0   210
Trade certificate              7881695.0   198
University degree and higher  11672583.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigran

In [59]:
# Dataset year in 2019 inside Average annual wages and salaries

print("\nAge group in Alberta")
df_AvgAnnWages_2019_ByAge = df_AvgAnnWages_2019.loc[
    (df_AvgAnnWages_2019['Characteristics'] == '15 to 24 years') |
    (df_AvgAnnWages_2019['Characteristics'] == '25 to 34 years') |
    (df_AvgAnnWages_2019['Characteristics'] == '35 to 44 years') |
    (df_AvgAnnWages_2019['Characteristics'] == '45 to 54 years') |
    (df_AvgAnnWages_2019['Characteristics'] == '55 to 64 years') |
    (df_AvgAnnWages_2019['Characteristics'] == '65 years old and over')]
# print(df_AvgAnnWages_2019_ByAge.head(20))
grouped = df_AvgAnnWages_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2019_ByAge.index))

print("\nGender group in Alberta")
df_AvgAnnWages_2019_ByGender = df_AvgAnnWages_2019.loc[
    (df_AvgAnnWages_2019['Characteristics'] == 'Female employees') |
    (df_AvgAnnWages_2019['Characteristics'] == 'Male employees')
]
# print(df_AvgAnnWages_2019_ByGender.head(20))
grouped = df_AvgAnnWages_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2019_ByGender.index))

print("\nEducation group in Alberta")
df_AvgAnnWages_2019_ByEducation = df_AvgAnnWages_2019.loc[
    (df_AvgAnnWages_2019['Characteristics'] == 'High school diploma and less') |
    (df_AvgAnnWages_2019['Characteristics'] == 'Trade certificate') |
    (df_AvgAnnWages_2019['Characteristics'] == 'University degree and higher')
]
# print(df_AvgAnnWages_2019_ByEducation.head(20))
grouped = df_AvgAnnWages_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgAnnWages_2019_ByImmigrant = df_AvgAnnWages_2019.loc[
    (df_AvgAnnWages_2019['Characteristics'] == 'Immigrant employees') |
    (df_AvgAnnWages_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgAnnWages_2019_ByImmigrant.head(20))
grouped = df_AvgAnnWages_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgAnnWages_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnWages_2019_ByIndigenous = df_AvgAnnWages_2019.loc[
#     (df_AvgAnnWages_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgAnnWages_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgAnnWages_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnWages_2019_ByIndigenous.index))


Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          3482308.0   192
25 to 34 years          9108041.0   210
35 to 44 years         12296655.0   210
45 to 54 years         13595377.0   210
55 to 64 years         12592232.0   210
65 years old and over   6840988.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees   9929829.0   210
Male employees    11755264.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   6915927.0   210
Trade certificate              8609637.0   198
University degree and higher  12581443.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigran

Filtered for "Average hourly wage" by following: "Age group", "Gender level", "Education level", and "Immigration status". <br />
"Aboriginal status" has been commented.

In [60]:
# # Dataset year of 2010 inside "Average hourly wage"

# print("\nAge group in Alberta")
# df_AvgHrsWages_2010_ByAge = df_AvgHrsWages_2010.loc[
#     (df_AvgHrsWages_2010['Characteristics'] == '15 to 24 years') |
#     (df_AvgHrsWages_2010['Characteristics'] == '25 to 34 years') |
#     (df_AvgHrsWages_2010['Characteristics'] == '35 to 44 years') |
#     (df_AvgHrsWages_2010['Characteristics'] == '45 to 54 years') |
#     (df_AvgHrsWages_2010['Characteristics'] == '55 to 64 years') |
#     (df_AvgHrsWages_2010['Characteristics'] == '65 years old and over')]
# # print(df_AvgHrsWages_2010_ByAge.head(20))
# grouped = df_AvgHrsWages_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_AvgHrsWages_2010_ByGender = df_AvgHrsWages_2010.loc[
#     (df_AvgHrsWages_2010['Characteristics'] == 'Female employees') |
#     (df_AvgHrsWages_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_AvgHrsWages_2010_ByGender.head(20))
# grouped = df_AvgHrsWages_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_AvgHrsWages_2010_ByEducation = df_AvgHrsWages_2010.loc[
#     (df_AvgHrsWages_2010['Characteristics'] == 'High school diploma and less') |
#     (df_AvgHrsWages_2010['Characteristics'] == 'Trade certificate') |
#     (df_AvgHrsWages_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_AvgHrsWages_2010_ByEducation.head(20))
# grouped = df_AvgHrsWages_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_AvgHrsWages_2010_ByImmigrant = df_AvgHrsWages_2010.loc[
#     (df_AvgHrsWages_2010['Characteristics'] == 'Immigrant employees') |
#     (df_AvgHrsWages_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_AvgHrsWages_2010_ByImmigrant.head(20))
# grouped = df_AvgHrsWages_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgHrsWages_2010_ByIndigenous = df_AvgHrsWages_2010.loc[
#     (df_AvgHrsWages_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgHrsWages_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgHrsWages_2010_ByIndigenous.head(20))
# grouped = df_AvgHrsWages_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2010_ByIndigenous.index))

In [61]:
# Dataset year of 2013 inside "Average hourly wage"

print("\nAge group in Alberta")
df_AvgHrsWages_2013_ByAge = df_AvgHrsWages_2013.loc[
    (df_AvgHrsWages_2013['Characteristics'] == '15 to 24 years') |
    (df_AvgHrsWages_2013['Characteristics'] == '25 to 34 years') |
    (df_AvgHrsWages_2013['Characteristics'] == '35 to 44 years') |
    (df_AvgHrsWages_2013['Characteristics'] == '45 to 54 years') |
    (df_AvgHrsWages_2013['Characteristics'] == '55 to 64 years') |
    (df_AvgHrsWages_2013['Characteristics'] == '65 years old and over')]
# print(df_AvgHrsWages_2013_ByAge.head(20))
grouped = df_AvgHrsWages_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2013_ByAge.index))

print("\nGender group in Alberta")
df_AvgHrsWages_2013_ByGender = df_AvgHrsWages_2013.loc[
    (df_AvgHrsWages_2013['Characteristics'] == 'Female employees') |
    (df_AvgHrsWages_2013['Characteristics'] == 'Male employees')
]
# print(df_AvgHrsWages_2013_ByGender.head(20))
grouped = df_AvgHrsWages_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2013_ByGender.index))

print("\nEducation group in Alberta")
df_AvgHrsWages_2013_ByEducation = df_AvgHrsWages_2013.loc[
    (df_AvgHrsWages_2013['Characteristics'] == 'High school diploma and less') |
    (df_AvgHrsWages_2013['Characteristics'] == 'Trade certificate') |
    (df_AvgHrsWages_2013['Characteristics'] == 'University degree and higher')
]
# print(df_AvgHrsWages_2013_ByEducation.head(20))
grouped = df_AvgHrsWages_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgHrsWages_2013_ByImmigrant = df_AvgHrsWages_2013.loc[
    (df_AvgHrsWages_2013['Characteristics'] == 'Immigrant employees') |
    (df_AvgHrsWages_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgHrsWages_2013_ByImmigrant.head(20))
grouped = df_AvgHrsWages_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgHrsWages_2013_ByIndigenous = df_AvgHrsWages_2013.loc[
#     (df_AvgHrsWages_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgHrsWages_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgHrsWages_2013_ByIndigenous.head(20))
# grouped = df_AvgHrsWages_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2013_ByIndigenous.index))


Age group in Alberta
                           sum  size
Characteristics                     
15 to 24 years         3089.01   192
25 to 34 years         4937.02   210
35 to 44 years         6053.85   210
45 to 54 years         6297.06   210
55 to 64 years         6605.48   210
65 years old and over  5475.51   192
The total number of this one is  1224

Gender group in Alberta
                      sum  size
Characteristics                
Female employees  5447.57   210
Male employees    6321.74   210
The total number of this one is  420

Education group in Alberta
                                  sum  size
Characteristics                            
High school diploma and less  4434.80   210
Trade certificate             4774.81   198
University degree and higher  6415.25   198
The total number of this one is  606

Immigrant group in Alberta
                             sum  size
Characteristics                       
Immigrant employees      5306.29   198
Non-immigrant employees 

In [62]:
# Dataset year in 2017 inside "Average hourly wage"

print("\nAge group in Alberta")
df_AvgHrsWages_2016_ByAge = df_AvgHrsWages_2016.loc[
    (df_AvgHrsWages_2016['Characteristics'] == '15 to 24 years') |
    (df_AvgHrsWages_2016['Characteristics'] == '25 to 34 years') |
    (df_AvgHrsWages_2016['Characteristics'] == '35 to 44 years') |
    (df_AvgHrsWages_2016['Characteristics'] == '45 to 54 years') |
    (df_AvgHrsWages_2016['Characteristics'] == '55 to 64 years') |
    (df_AvgHrsWages_2016['Characteristics'] == '65 years old and over')]
# print(df_AvgHrsWages_2016_ByAge.head(20))
grouped = df_AvgHrsWages_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2016_ByAge.index))

print("\nGender group in Alberta")
df_AvgHrsWages_2016_ByGender = df_AvgHrsWages_2016.loc[
    (df_AvgHrsWages_2016['Characteristics'] == 'Female employees') |
    (df_AvgHrsWages_2016['Characteristics'] == 'Male employees')
]
# print(df_AvgHrsWages_2016_ByGender.head(20))
grouped = df_AvgHrsWages_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2016_ByGender.index))

print("\nEducation group in Alberta")
df_AvgHrsWages_2016_ByEducation = df_AvgHrsWages_2016.loc[
    (df_AvgHrsWages_2016['Characteristics'] == 'High school diploma and less') |
    (df_AvgHrsWages_2016['Characteristics'] == 'Trade certificate') |
    (df_AvgHrsWages_2016['Characteristics'] == 'University degree and higher')
]
# print(df_AvgHrsWages_2016_ByEducation.head(20))
grouped = df_AvgHrsWages_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgHrsWages_2016_ByImmigrant = df_AvgHrsWages_2016.loc[
    (df_AvgHrsWages_2016['Characteristics'] == 'Immigrant employees') |
    (df_AvgHrsWages_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgHrsWages_2016_ByImmigrant.head(20))
grouped = df_AvgHrsWages_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgAnnHrk_ByIndigenous = df_AvgHrsWages_2016.loc[
#     (df_AvgHrsWages_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgHrsWages_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgAnnHrk_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgAnnHrk_ByIndigenous.index))


Age group in Alberta
                           sum  size
Characteristics                     
15 to 24 years         3305.36   192
25 to 34 years         5214.72   210
35 to 44 years         6375.48   210
45 to 54 years         6704.45   210
55 to 64 years         6934.64   210
65 years old and over  5778.32   192
The total number of this one is  1224

Gender group in Alberta
                      sum  size
Characteristics                
Female employees  5805.97   210
Male employees    6607.99   210
The total number of this one is  420

Education group in Alberta
                                  sum  size
Characteristics                            
High school diploma and less  4707.09   210
Trade certificate             5053.86   198
University degree and higher  6739.74   198
The total number of this one is  606

Immigrant group in Alberta
                             sum  size
Characteristics                       
Immigrant employees      5560.78   198
Non-immigrant employees 

In [63]:
# Dataset year in 2019 inside "Average hourly wage"

print("\nAge group in Alberta")
df_AvgHrsWages_2019_ByAge = df_AvgHrsWages_2019.loc[
    (df_AvgHrsWages_2019['Characteristics'] == '15 to 24 years') |
    (df_AvgHrsWages_2019['Characteristics'] == '25 to 34 years') |
    (df_AvgHrsWages_2019['Characteristics'] == '35 to 44 years') |
    (df_AvgHrsWages_2019['Characteristics'] == '45 to 54 years') |
    (df_AvgHrsWages_2019['Characteristics'] == '55 to 64 years') |
    (df_AvgHrsWages_2019['Characteristics'] == '65 years old and over')]
# print(df_AvgHrsWages_2019_ByAge.head(20))
grouped = df_AvgHrsWages_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2019_ByAge.index))

print("\nGender group in Alberta")
df_AvgHrsWages_2019_ByGender = df_AvgHrsWages_2019.loc[
    (df_AvgHrsWages_2019['Characteristics'] == 'Female employees') |
    (df_AvgHrsWages_2019['Characteristics'] == 'Male employees')
]
# print(df_AvgHrsWages_2019_ByGender.head(20))
grouped = df_AvgHrsWages_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2019_ByGender.index))

print("\nEducation group in Alberta")
df_AvgHrsWages_2019_ByEducation = df_AvgHrsWages_2019.loc[
    (df_AvgHrsWages_2019['Characteristics'] == 'High school diploma and less') |
    (df_AvgHrsWages_2019['Characteristics'] == 'Trade certificate') |
    (df_AvgHrsWages_2019['Characteristics'] == 'University degree and higher')
]
# print(df_AvgHrsWages_2019_ByEducation.head(20))
grouped = df_AvgHrsWages_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgHrsWages_2019_ByImmigrant = df_AvgHrsWages_2019.loc[
    (df_AvgHrsWages_2019['Characteristics'] == 'Immigrant employees') |
    (df_AvgHrsWages_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgHrsWages_2019_ByImmigrant.head(20))
grouped = df_AvgHrsWages_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgHrsWages_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgHrsWages_2019_ByIndigenous = df_AvgHrsWages_2019.loc[
#     (df_AvgHrsWages_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgHrsWages_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgHrsWages_2019_ByIndigenous.head(20))
# grouped = df_AvgHrsWages_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgHrsWages_2019_ByIndigenous.index))


Age group in Alberta
                           sum  size
Characteristics                     
15 to 24 years         3689.12   192
25 to 34 years         5759.41   210
35 to 44 years         6991.24   210
45 to 54 years         7451.95   210
55 to 64 years         7475.92   210
65 years old and over  6329.58   192
The total number of this one is  1224

Gender group in Alberta
                      sum  size
Characteristics                
Female employees  6439.32   210
Male employees    7203.15   210
The total number of this one is  420

Education group in Alberta
                                  sum  size
Characteristics                            
High school diploma and less  5242.44   210
Trade certificate             5569.30   198
University degree and higher  7296.08   198
The total number of this one is  606

Immigrant group in Alberta
                             sum  size
Characteristics                       
Immigrant employees      6065.65   198
Non-immigrant employees 

Filtered for "Average weekly hours worked" by following: "Age group", "Gender level", "Education level", and "Immigration status".<br />
"Aboriginal status" has been commented.

In [64]:
# # Dataset year in 2010 inside "Average weekly hours worked"

# print("\nAge group in Alberta")
# df_AvgWeekHrsWrked_2010_ByAge = df_AvgWeekHrsWrked_2010.loc[
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '15 to 24 years') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '25 to 34 years') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '35 to 44 years') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '45 to 54 years') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '55 to 64 years') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == '65 years old and over')]
# # print(df_AvgWeekHrsWrked_2010_ByAge.head(20))
# grouped = df_AvgWeekHrsWrked_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_AvgWeekHrsWrked_2010_ByGender = df_AvgWeekHrsWrked_2010.loc[
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Female employees') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_AvgWeekHrsWrked_2010_ByGender.head(20))
# grouped = df_AvgWeekHrsWrked_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_AvgWeekHrsWrked_2010_ByEducation = df_AvgWeekHrsWrked_2010.loc[
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'High school diploma and less') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Trade certificate') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_AvgWeekHrsWrked_2010_ByEducation.head(20))
# grouped = df_AvgWeekHrsWrked_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_AvgWeekHrsWrked_2010_ByImmigrant = df_AvgWeekHrsWrked_2010.loc[
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Immigrant employees') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_AvgWeekHrsWrked_2010_ByImmigrant.head(20))
# grouped = df_AvgWeekHrsWrked_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgWeekHrsWrked_2010_ByIndigenous = df_AvgWeekHrsWrked_2010.loc[
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgWeekHrsWrked_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgWeekHrsWrked_2010_ByIndigenous.head(20))
# grouped = df_AvgWeekHrsWrked_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2010_ByIndigenous.index))

In [65]:
# Dataset year in 2013 inside "Average weekly hours worked"

print("\nAge group in Alberta")
df_AvgWeekHrsWrked_2013_ByAge = df_AvgWeekHrsWrked_2013.loc[
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '15 to 24 years') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '25 to 34 years') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '35 to 44 years') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '45 to 54 years') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '55 to 64 years') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == '65 years old and over')]
# print(df_AvgWeekHrsWrked_2013_ByAge.head(20))
grouped = df_AvgWeekHrsWrked_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2013_ByAge.index))

print("\nGender group in Alberta")
df_AvgWeekHrsWrked_2013_ByGender = df_AvgWeekHrsWrked_2013.loc[
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Female employees') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Male employees')
]
# print(df_AvgWeekHrsWrked_2013_ByGender.head(20))
grouped = df_AvgWeekHrsWrked_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2013_ByGender.index))

print("\nEducation group in Alberta")
df_AvgWeekHrsWrked_2013_ByEducation = df_AvgWeekHrsWrked_2013.loc[
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'High school diploma and less') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Trade certificate') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'University degree and higher')
]
# print(df_AvgWeekHrsWrked_2013_ByEducation.head(20))
grouped = df_AvgWeekHrsWrked_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgWeekHrsWrked_2013_ByImmigrant = df_AvgWeekHrsWrked_2013.loc[
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Immigrant employees') |
    (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgWeekHrsWrked_2013_ByImmigrant.head(20))
grouped = df_AvgWeekHrsWrked_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgWeekHrsWrked_2013_ByIndigenous = df_AvgWeekHrsWrked_2013.loc[
#     (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgWeekHrsWrked_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgWeekHrsWrked_2013_ByIndigenous.head(20))
# grouped = df_AvgWeekHrsWrked_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2013_ByIndigenous.index))


Age group in Alberta
                          sum  size
Characteristics                    
15 to 24 years         3314.0   192
25 to 34 years         6556.0   210
35 to 44 years         7256.0   210
45 to 54 years         7454.0   210
55 to 64 years         6780.0   210
65 years old and over  3983.0   192
The total number of this one is  1224

Gender group in Alberta
                     sum  size
Characteristics               
Female employees  6196.0   210
Male employees    6768.0   210
The total number of this one is  420

Education group in Alberta
                                 sum  size
Characteristics                           
High school diploma and less  5334.0   210
Trade certificate             6034.0   198
University degree and higher  6681.0   198
The total number of this one is  606

Immigrant group in Alberta
                            sum  size
Characteristics                      
Immigrant employees      6278.0   198
Non-immigrant employees  6001.0   198
The to

In [66]:
# Dataset year in 2016 inside "Average weekly hours worked"

print("\nAge group in Alberta")
df_AvgWeekHrsWrked_2016_ByAge = df_AvgWeekHrsWrked_2016.loc[
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '15 to 24 years') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '25 to 34 years') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '35 to 44 years') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '45 to 54 years') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '55 to 64 years') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == '65 years old and over')]
# print(df_AvgWeekHrsWrked_2016_ByAge.head(20))
grouped = df_AvgWeekHrsWrked_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2016_ByAge.index))

print("\nGender group in Alberta")
df_AvgWeekHrsWrked_2016_ByGender = df_AvgWeekHrsWrked_2016.loc[
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Female employees') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Male employees')
]
# print(df_AvgWeekHrsWrked_2016_ByGender.head(20))
grouped = df_AvgWeekHrsWrked_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2016_ByGender.index))

print("\nEducation group in Alberta")
df_AvgWeekHrsWrked_2016_ByEducation = df_AvgWeekHrsWrked_2016.loc[
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'High school diploma and less') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Trade certificate') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'University degree and higher')
]
# print(df_AvgWeekHrsWrked_2016_ByEducation.head(20))
grouped = df_AvgWeekHrsWrked_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgWeekHrsWrked_2016_ByImmigrant = df_AvgWeekHrsWrked_2016.loc[
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Immigrant employees') |
    (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgWeekHrsWrked_2016_ByImmigrant.head(20))
grouped = df_AvgWeekHrsWrked_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgWeekHrsWrked_2016_ByIndigenous = df_AvgWeekHrsWrked_2016.loc[
#     (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgWeekHrsWrked_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_AvgWeekHrsWrked_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2016_ByIndigenous.index))


Age group in Alberta
                          sum  size
Characteristics                    
15 to 24 years         3387.0   192
25 to 34 years         6489.0   210
35 to 44 years         7185.0   210
45 to 54 years         7415.0   210
55 to 64 years         6748.0   210
65 years old and over  3965.0   192
The total number of this one is  1224

Gender group in Alberta
                     sum  size
Characteristics               
Female employees  6184.0   210
Male employees    6668.0   210
The total number of this one is  420

Education group in Alberta
                                 sum  size
Characteristics                           
High school diploma and less  5315.0   210
Trade certificate             5922.0   198
University degree and higher  6604.0   198
The total number of this one is  606

Immigrant group in Alberta
                            sum  size
Characteristics                      
Immigrant employees      6185.0   198
Non-immigrant employees  5968.0   198
The to

In [67]:
# Dataset year in 2019 inside "Average weekly hours worked"

print("\nAge group in Alberta")
df_AvgWeekHrsWrked_2019_ByAge = df_AvgWeekHrsWrked_2019.loc[
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '15 to 24 years') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '25 to 34 years') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '35 to 44 years') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '45 to 54 years') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '55 to 64 years') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == '65 years old and over')]
# print(df_AvgWeekHrsWrked_2019_ByAge.head(20))
grouped = df_AvgWeekHrsWrked_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2019_ByAge.index))

print("\nGender group in Alberta")
df_AvgWeekHrsWrked_2019_ByGender = df_AvgWeekHrsWrked_2019.loc[
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Female employees') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Male employees')
]
# print(df_AvgWeekHrsWrked_2019_ByGender.head(20))
grouped = df_AvgWeekHrsWrked_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2019_ByGender.index))

print("\nEducation group in Alberta")
df_AvgWeekHrsWrked_2019_ByEducation = df_AvgWeekHrsWrked_2019.loc[
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'High school diploma and less') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Trade certificate') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'University degree and higher')
]
# print(df_AvgWeekHrsWrked_2019_ByEducation.head(20))
grouped = df_AvgWeekHrsWrked_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_AvgWeekHrsWrked_2019_ByImmigrant = df_AvgWeekHrsWrked_2019.loc[
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Immigrant employees') |
    (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_AvgWeekHrsWrked_2019_ByImmigrant.head(20))
grouped = df_AvgWeekHrsWrked_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_AvgWeekHrsWrked_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_AvgWeekHrsWrked_2019_ByIndigenous = df_AvgWeekHrsWrked_2019.loc[
#     (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_AvgWeekHrsWrked_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgWeekHrsWrked_2019_ByIndigenous.head(20))
# grouped = df_AvgWeekHrsWrked_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_AvgWeekHrsWrked_2019_ByIndigenous.index))


Age group in Alberta
                          sum  size
Characteristics                    
15 to 24 years         3465.0   192
25 to 34 years         6413.0   210
35 to 44 years         7139.0   210
45 to 54 years         7406.0   210
55 to 64 years         6765.0   210
65 years old and over  3969.0   192
The total number of this one is  1224

Gender group in Alberta
                     sum  size
Characteristics               
Female employees  6219.0   210
Male employees    6590.0   210
The total number of this one is  420

Education group in Alberta
                                 sum  size
Characteristics                           
High school diploma and less  5303.0   210
Trade certificate             5889.0   198
University degree and higher  6568.0   198
The total number of this one is  606

Immigrant group in Alberta
                            sum  size
Characteristics                      
Immigrant employees      6133.0   198
Non-immigrant employees  5961.0   198
The to

Filtered for "Hours worked" by following: "Age group", "Gender level", "Education level", and "Immigration status".<br />
"Aboriginal status" has been commented.


In [68]:
# # Dataset year in 2010 inside "Hours Worked"

# print("\nAge group in Alberta")
# df_Hrs_Wrked_2010_ByAge = df_Hrs_Wrked_2010.loc[
#     (df_Hrs_Wrked_2010['Characteristics'] == '15 to 24 years') |
#     (df_Hrs_Wrked_2010['Characteristics'] == '25 to 34 years') |
#     (df_Hrs_Wrked_2010['Characteristics'] == '35 to 44 years') |
#     (df_Hrs_Wrked_2010['Characteristics'] == '45 to 54 years') |
#     (df_Hrs_Wrked_2010['Characteristics'] == '55 to 64 years') |
#     (df_Hrs_Wrked_2010['Characteristics'] == '65 years old and over')]
# # print(df_Hrs_Wrked_2010_ByAge.head(20))
# grouped = df_Hrs_Wrked_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_Hrs_Wrked_2010_ByGender = df_Hrs_Wrked_2010.loc[
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Female employees') |
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_Hrs_Wrked_2010_ByGender.head(20))
# grouped = df_Hrs_Wrked_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_Hrs_Wrked_2010_ByEducation = df_Hrs_Wrked_2010.loc[
#     (df_Hrs_Wrked_2010['Characteristics'] == 'High school diploma and less') |
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Trade certificate') |
#     (df_Hrs_Wrked_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_Hrs_Wrked_2010_ByEducation.head(20))
# grouped = df_Hrs_Wrked_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_Hrs_Wrked_2010_ByImmigrant = df_Hrs_Wrked_2010.loc[
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Immigrant employees') |
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_Hrs_Wrked_2010_ByImmigrant.head(20))
# grouped = df_Hrs_Wrked_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_Hrs_Wrked_2010_ByIndigenous = df_Hrs_Wrked_2010.loc[
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_Hrs_Wrked_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_Hrs_Wrked_2010_ByIndigenous.head(20))
# grouped = df_Hrs_Wrked_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2010_ByIndigenous.index))

In [69]:
# Dataset year in 2013 inside "Hours Worked"

print("\nAge group in Alberta")
df_Hrs_Wrked_2013_ByAge = df_Hrs_Wrked_2013.loc[
    (df_Hrs_Wrked_2013['Characteristics'] == '15 to 24 years') |
    (df_Hrs_Wrked_2013['Characteristics'] == '25 to 34 years') |
    (df_Hrs_Wrked_2013['Characteristics'] == '35 to 44 years') |
    (df_Hrs_Wrked_2013['Characteristics'] == '45 to 54 years') |
    (df_Hrs_Wrked_2013['Characteristics'] == '55 to 64 years') |
    (df_Hrs_Wrked_2013['Characteristics'] == '65 years old and over')]
# print(df_Hrs_Wrked_2013_ByAge.head(20))
grouped = df_Hrs_Wrked_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2013_ByAge.index))

print("\nGender group in Alberta")
df_Hrs_Wrked_2013_ByGender = df_Hrs_Wrked_2013.loc[
    (df_Hrs_Wrked_2013['Characteristics'] == 'Female employees') |
    (df_Hrs_Wrked_2013['Characteristics'] == 'Male employees')
]
# print(df_Hrs_Wrked_2013_ByGender.head(20))
grouped = df_Hrs_Wrked_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2013_ByGender.index))

print("\nEducation group in Alberta")
df_Hrs_Wrked_2013_ByEducation = df_Hrs_Wrked_2013.loc[
    (df_Hrs_Wrked_2013['Characteristics'] == 'High school diploma and less') |
    (df_Hrs_Wrked_2013['Characteristics'] == 'Trade certificate') |
    (df_Hrs_Wrked_2013['Characteristics'] == 'University degree and higher')
]
# print(df_Hrs_Wrked_2013_ByEducation.head(20))
grouped = df_Hrs_Wrked_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_Hrs_Wrked_2013_ByImmigrant = df_Hrs_Wrked_2013.loc[
    (df_Hrs_Wrked_2013['Characteristics'] == 'Immigrant employees') |
    (df_Hrs_Wrked_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_Hrs_Wrked_2013_ByImmigrant.head(20))
grouped = df_Hrs_Wrked_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_Hrs_Wrked_2013_ByIndigenous = df_Hrs_Wrked_2013.loc[
#     (df_Hrs_Wrked_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_Hrs_Wrked_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_Hrs_Wrked_2013_ByIndigenous.head(20))
# grouped = df_Hrs_Wrked_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2013_ByIndigenous.index))


Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          3107754.0   192
25 to 34 years         10595996.0   210
35 to 44 years         11379938.0   210
45 to 54 years         13158896.0   210
55 to 64 years          9899434.0   210
65 years old and over   1974184.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  34086022.0   210
Male employees    16032036.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   9259138.0   210
Trade certificate              3736455.0   198
University degree and higher  23601210.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigr

In [70]:
# Dataset year in 2016 inside "Hours Worked"

print("\nAge group in Alberta")
df_Hrs_Wrked_2016_ByAge = df_Hrs_Wrked_2016.loc[
    (df_Hrs_Wrked_2016['Characteristics'] == '15 to 24 years') |
    (df_Hrs_Wrked_2016['Characteristics'] == '25 to 34 years') |
    (df_Hrs_Wrked_2016['Characteristics'] == '35 to 44 years') |
    (df_Hrs_Wrked_2016['Characteristics'] == '45 to 54 years') |
    (df_Hrs_Wrked_2016['Characteristics'] == '55 to 64 years') |
    (df_Hrs_Wrked_2016['Characteristics'] == '65 years old and over')]
# print(df_Hrs_Wrked_2016_ByAge.head(20))
grouped = df_Hrs_Wrked_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2016_ByAge.index))

print("\nGender group in Alberta")
df_Hrs_Wrked_2016_ByGender = df_Hrs_Wrked_2016.loc[
    (df_Hrs_Wrked_2016['Characteristics'] == 'Female employees') |
    (df_Hrs_Wrked_2016['Characteristics'] == 'Male employees')
]
# print(df_Hrs_Wrked_2016_ByGender.head(20))
grouped = df_Hrs_Wrked_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2016_ByGender.index))

print("\nEducation group in Alberta")
df_Hrs_Wrked_2016_ByEducation = df_Hrs_Wrked_2016.loc[
    (df_Hrs_Wrked_2016['Characteristics'] == 'High school diploma and less') |
    (df_Hrs_Wrked_2016['Characteristics'] == 'Trade certificate') |
    (df_Hrs_Wrked_2016['Characteristics'] == 'University degree and higher')
]
# print(df_Hrs_Wrked_2016_ByEducation.head(20))
grouped = df_Hrs_Wrked_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_Hrs_Wrked_2016_ByImmigrant = df_Hrs_Wrked_2016.loc[
    (df_Hrs_Wrked_2016['Characteristics'] == 'Immigrant employees') |
    (df_Hrs_Wrked_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_Hrs_Wrked_2016_ByImmigrant.head(20))
grouped = df_Hrs_Wrked_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_Hrs_Wrked_2016_ByIndigenous = df_Hrs_Wrked_2016.loc[
#     (df_Hrs_Wrked_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_Hrs_Wrked_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_AvgAnnHrk_ByIndigenous.head(20))
# grouped = df_Hrs_Wrked_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2016_ByIndigenous.index))


Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          3211600.0   192
25 to 34 years         11296322.0   210
35 to 44 years         12146992.0   210
45 to 54 years         13070232.0   210
55 to 64 years         10422244.0   210
65 years old and over   2324345.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  35855312.0   210
Male employees    16618598.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   9485452.0   210
Trade certificate              3558304.0   198
University degree and higher  25474813.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigr

In [71]:
# Dataset year in 2019 inside "Hour Worked"

print("\nAge group in Alberta")
df_Hrs_Wrked_2019_ByAge = df_Hrs_Wrked_2019.loc[
    (df_Hrs_Wrked_2019['Characteristics'] == '15 to 24 years') |
    (df_Hrs_Wrked_2019['Characteristics'] == '25 to 34 years') |
    (df_Hrs_Wrked_2019['Characteristics'] == '35 to 44 years') |
    (df_Hrs_Wrked_2019['Characteristics'] == '45 to 54 years') |
    (df_Hrs_Wrked_2019['Characteristics'] == '55 to 64 years') |
    (df_Hrs_Wrked_2019['Characteristics'] == '65 years old and over')]
# print(df_Hrs_Wrked_2019_ByAge.head(20))
grouped = df_Hrs_Wrked_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2019_ByAge.index))

print("\nGender group in Alberta")
df_Hrs_Wrked_2019_ByGender = df_Hrs_Wrked_2019.loc[
    (df_Hrs_Wrked_2019['Characteristics'] == 'Female employees') |
    (df_Hrs_Wrked_2019['Characteristics'] == 'Male employees')
]
# print(df_Hrs_Wrked_2019_ByGender.head(20))
grouped = df_Hrs_Wrked_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2019_ByGender.index))

print("\nEducation group in Alberta")
df_Hrs_Wrked_2019_ByEducation = df_Hrs_Wrked_2019.loc[
    (df_Hrs_Wrked_2019['Characteristics'] == 'High school diploma and less') |
    (df_Hrs_Wrked_2019['Characteristics'] == 'Trade certificate') |
    (df_Hrs_Wrked_2019['Characteristics'] == 'University degree and higher')
]
# print(df_Hrs_Wrked_2019_ByEducation.head(20))
grouped = df_Hrs_Wrked_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_Hrs_Wrked_2019_ByImmigrant = df_Hrs_Wrked_2019.loc[
    (df_Hrs_Wrked_2019['Characteristics'] == 'Immigrant employees') |
    (df_Hrs_Wrked_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_Hrs_Wrked_2019_ByImmigrant.head(20))
grouped = df_Hrs_Wrked_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_Hrs_Wrked_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_Hrs_Wrked_2019_ByIndigenous = df_Hrs_Wrked_2019.loc[
#     (df_Hrs_Wrked_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_Hrs_Wrked_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_Hrs_Wrked_2019_ByIndigenous.head(20))
# grouped = df_Hrs_Wrked_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_Hrs_Wrked_2019_ByIndigenous.index))



Age group in Alberta
                              sum  size
Characteristics                        
15 to 24 years          3132171.0   192
25 to 34 years         11857576.0   210
35 to 44 years         12919606.0   210
45 to 54 years         12974580.0   210
55 to 64 years         10690620.0   210
65 years old and over   2528517.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  37193214.0   210
Male employees    16912308.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   9285900.0   210
Trade certificate              3449203.0   198
University degree and higher  27393105.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigr

Filtered for "Number of jobs" by following: "Age group", "Gender level", "Education level", and "Immigration status".<br />
"Aboriginal status" has been commented.

In [72]:
# # Dataset year in 2010 inside "Number of jobs"

# print("\nAge group in Alberta")
# df_NumOfJob_2010_ByAge = df_NumOfJob_2010.loc[
#     (df_NumOfJob_2010['Characteristics'] == '15 to 24 years') |
#     (df_NumOfJob_2010['Characteristics'] == '25 to 34 years') |
#     (df_NumOfJob_2010['Characteristics'] == '35 to 44 years') |
#     (df_NumOfJob_2010['Characteristics'] == '45 to 54 years') |
#     (df_NumOfJob_2010['Characteristics'] == '55 to 64 years') |
#     (df_NumOfJob_2010['Characteristics'] == '65 years old and over')]
# # print(df_NumOfJob_2010_ByAge.head(20))
# grouped = df_NumOfJob_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_NumOfJob_2010_ByGender = df_NumOfJob_2010.loc[
#     (df_NumOfJob_2010['Characteristics'] == 'Female employees') |
#     (df_NumOfJob_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_NumOfJob_2010_ByGender.head(20))
# grouped = df_NumOfJob_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_NumOfJob_2010_ByEducation = df_NumOfJob_2010.loc[
#     (df_NumOfJob_2010['Characteristics'] == 'High school diploma and less') |
#     (df_NumOfJob_2010['Characteristics'] == 'Trade certificate') |
#     (df_NumOfJob_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_NumOfJob_2010_ByEducation.head(20))
# grouped = df_NumOfJob_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_NumOfJob_2010_ByImmigrant = df_NumOfJob_2010.loc[
#     (df_NumOfJob_2010['Characteristics'] == 'Immigrant employees') |
#     (df_NumOfJob_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_NumOfJob_2010_ByImmigrant.head(20))
# grouped = df_NumOfJob_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_NumOfJob_2010_ByIndigenous = df_NumOfJob_2010.loc[
#     (df_NumOfJob_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_NumOfJob_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_NumOfJob_2010_ByIndigenous.head(20))
# grouped = df_NumOfJob_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2010_ByIndigenous.index))

In [73]:
# Dataset year in 2013 inside "Number of jobs"

print("\nAge group in Alberta")
df_NumOfJob_2013_ByAge = df_NumOfJob_2013.loc[
    (df_NumOfJob_2013['Characteristics'] == '15 to 24 years') |
    (df_NumOfJob_2013['Characteristics'] == '25 to 34 years') |
    (df_NumOfJob_2013['Characteristics'] == '35 to 44 years') |
    (df_NumOfJob_2013['Characteristics'] == '45 to 54 years') |
    (df_NumOfJob_2013['Characteristics'] == '55 to 64 years') |
    (df_NumOfJob_2013['Characteristics'] == '65 years old and over')]
# print(df_NumOfJob_2013_ByAge.head(20))
grouped = df_NumOfJob_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2013_ByAge.index))

print("\nGender group in Alberta")
df_NumOfJob_2013_ByGender = df_NumOfJob_2013.loc[
    (df_NumOfJob_2013['Characteristics'] == 'Female employees') |
    (df_NumOfJob_2013['Characteristics'] == 'Male employees')
]
# print(df_NumOfJob_2013_ByGender.head(20))
grouped = df_NumOfJob_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2013_ByGender.index))

print("\nEducation group in Alberta")
df_NumOfJob_2013_ByEducation = df_NumOfJob_2013.loc[
    (df_NumOfJob_2013['Characteristics'] == 'High school diploma and less') |
    (df_NumOfJob_2013['Characteristics'] == 'Trade certificate') |
    (df_NumOfJob_2013['Characteristics'] == 'University degree and higher')
]
# print(df_NumOfJob_2013_ByEducation.head(20))
grouped = df_NumOfJob_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_NumOfJob_2013_ByImmigrant = df_NumOfJob_2013.loc[
    (df_NumOfJob_2013['Characteristics'] == 'Immigrant employees') |
    (df_NumOfJob_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_NumOfJob_2013_ByImmigrant.head(20))
grouped = df_NumOfJob_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_NumOfJob_2013_ByIndigenous = df_NumOfJob_2013.loc[
#     (df_NumOfJob_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_NumOfJob_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_NumOfJob_2013_ByIndigenous.head(20))
# grouped = df_NumOfJob_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2013_ByIndigenous.index))


Age group in Alberta
                             sum  size
Characteristics                       
15 to 24 years         3475038.0   192
25 to 34 years         6767092.0   210
35 to 44 years         6596518.0   210
45 to 54 years         7311054.0   210
55 to 64 years         6008488.0   210
65 years old and over  1910168.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  22196810.0   210
Male employees     9873690.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   7081390.0   210
Trade certificate              2417879.0   198
University degree and higher  14192344.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigrant empl

In [74]:
# Dataset year in 2016 inside "Number of jobs"

print("\nAge group in Alberta")
df_NumOfJob_2016_ByAge = df_NumOfJob_2016.loc[
    (df_NumOfJob_2016['Characteristics'] == '15 to 24 years') |
    (df_NumOfJob_2016['Characteristics'] == '25 to 34 years') |
    (df_NumOfJob_2016['Characteristics'] == '35 to 44 years') |
    (df_NumOfJob_2016['Characteristics'] == '45 to 54 years') |
    (df_NumOfJob_2016['Characteristics'] == '55 to 64 years') |
    (df_NumOfJob_2016['Characteristics'] == '65 years old and over')]
# print(df_NumOfJob_2016_ByAge.head(20))
grouped = df_NumOfJob_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2016_ByAge.index))

print("\nGender group in Alberta")
df_NumOfJob_2016_ByGender = df_NumOfJob_2016.loc[
    (df_NumOfJob_2016['Characteristics'] == 'Female employees') |
    (df_NumOfJob_2016['Characteristics'] == 'Male employees')
]
# print(df_NumOfJob_2016_ByGender.head(20))
grouped = df_NumOfJob_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2016_ByGender.index))

print("\nEducation group in Alberta")
df_NumOfJob_2016_ByEducation = df_NumOfJob_2016.loc[
    (df_NumOfJob_2016['Characteristics'] == 'High school diploma and less') |
    (df_NumOfJob_2016['Characteristics'] == 'Trade certificate') |
    (df_NumOfJob_2016['Characteristics'] == 'University degree and higher')
]
# print(df_NumOfJob_2016_ByEducation.head(20))
grouped = df_NumOfJob_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_NumOfJob_2016_ByImmigrant = df_NumOfJob_2016.loc[
    (df_NumOfJob_2016['Characteristics'] == 'Immigrant employees') |
    (df_NumOfJob_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_NumOfJob_2016_ByImmigrant.head(20))
grouped = df_NumOfJob_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_NumOfJob_2016_ByIndigenous = df_NumOfJob_2016.loc[
#     (df_NumOfJob_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_NumOfJob_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_NumOfJob_2016_ByIndigenous.head(20))
# grouped = df_NumOfJob_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2016_ByIndigenous.index))


Age group in Alberta
                             sum  size
Characteristics                       
15 to 24 years         3568591.0   192
25 to 34 years         7218822.0   210
35 to 44 years         7058062.0   210
45 to 54 years         7243040.0   210
55 to 64 years         6305020.0   210
65 years old and over  2240484.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  23313522.0   210
Male employees    10322788.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   7346224.0   210
Trade certificate              2312220.0   198
University degree and higher  15322286.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigrant empl

In [75]:
# Dataset year in 2019 inside "Number of jobs"

print("\nAge group in Alberta")
df_NumOfJob_2019_ByAge = df_NumOfJob_2019.loc[
    (df_NumOfJob_2019['Characteristics'] == '15 to 24 years') |
    (df_NumOfJob_2019['Characteristics'] == '25 to 34 years') |
    (df_NumOfJob_2019['Characteristics'] == '35 to 44 years') |
    (df_NumOfJob_2019['Characteristics'] == '45 to 54 years') |
    (df_NumOfJob_2019['Characteristics'] == '55 to 64 years') |
    (df_NumOfJob_2019['Characteristics'] == '65 years old and over')]
# print(df_NumOfJob_2019_ByAge.head(20))
grouped = df_NumOfJob_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2019_ByAge.index))

print("\nGender group in Alberta")
df_NumOfJob_2019_ByGender = df_NumOfJob_2019.loc[
    (df_NumOfJob_2019['Characteristics'] == 'Female employees') |
    (df_NumOfJob_2019['Characteristics'] == 'Male employees')
]
# print(df_NumOfJob_2019_ByGender.head(20))
grouped = df_NumOfJob_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2019_ByGender.index))

print("\nEducation group in Alberta")
df_NumOfJob_2019_ByEducation = df_NumOfJob_2019.loc[
    (df_NumOfJob_2019['Characteristics'] == 'High school diploma and less') |
    (df_NumOfJob_2019['Characteristics'] == 'Trade certificate') |
    (df_NumOfJob_2019['Characteristics'] == 'University degree and higher')
]
# print(df_NumOfJob_2019_ByEducation.head(20))
grouped = df_NumOfJob_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_NumOfJob_2019_ByImmigrant = df_NumOfJob_2019.loc[
    (df_NumOfJob_2019['Characteristics'] == 'Immigrant employees') |
    (df_NumOfJob_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_NumOfJob_2019_ByImmigrant.head(20))
grouped = df_NumOfJob_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_NumOfJob_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_NumOfJob_2019_ByIndigenous = df_NumOfJob_2019.loc[
#     (df_NumOfJob_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_NumOfJob_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_NumOfJob_2019_ByIndigenous.head(20))
# grouped = df_NumOfJob_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_NumOfJob_2019_ByIndigenous.index))


Age group in Alberta
                             sum  size
Characteristics                       
15 to 24 years         3414206.0   192
25 to 34 years         7615612.0   210
35 to 44 years         7559436.0   210
45 to 54 years         7201656.0   210
55 to 64 years         6451160.0   210
65 years old and over  2416851.0   192
The total number of this one is  1224

Gender group in Alberta
                         sum  size
Characteristics                   
Female employees  24112770.0   210
Male employees    10548592.0   210
The total number of this one is  420

Education group in Alberta
                                     sum  size
Characteristics                               
High school diploma and less   7204888.0   210
Trade certificate              2245738.0   198
University degree and higher  16513042.0   198
The total number of this one is  606

Immigrant group in Alberta
                                sum  size
Characteristics                          
Immigrant empl

Filtered for "Wages and Salaries" by following: "Age group", "Gender level", "Education level", and "Immigration status". <br />
"Aboriginal status" has been commented.

In [76]:
# # Dataset year in 2010 inside "Wages and Salaries"

# print("\nAge group in Alberta")
# df_WagesAndSalaries_2010_ByAge = df_WagesAndSalaries_2010.loc[
#     (df_WagesAndSalaries_2010['Characteristics'] == '15 to 24 years') |
#     (df_WagesAndSalaries_2010['Characteristics'] == '25 to 34 years') |
#     (df_WagesAndSalaries_2010['Characteristics'] == '35 to 44 years') |
#     (df_WagesAndSalaries_2010['Characteristics'] == '45 to 54 years') |
#     (df_WagesAndSalaries_2010['Characteristics'] == '55 to 64 years') |
#     (df_WagesAndSalaries_2010['Characteristics'] == '65 years old and over')]
# # print(df_WagesAndSalaries_2010_ByAge.head(20))
# grouped = df_WagesAndSalaries_2010_ByAge.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2010_ByAge.index))

# print("\nGender group in Alberta")
# df_WagesAndSalaries_2010_ByGender = df_WagesAndSalaries_2010.loc[
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Female employees') |
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Male employees')
# ]
# # print(df_WagesAndSalaries_2010_ByGender.head(20))
# grouped = df_WagesAndSalaries_2010_ByGender.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2010_ByGender.index))

# print("\nEducation group in Alberta")
# df_WagesAndSalaries_2010_ByEducation = df_WagesAndSalaries_2010.loc[
#     (df_WagesAndSalaries_2010['Characteristics'] == 'High school diploma and less') |
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Trade certificate') |
#     (df_WagesAndSalaries_2010['Characteristics'] == 'University degree and higher')
# ]
# # print(df_WagesAndSalaries_2010_ByEducation.head(20))
# grouped = df_WagesAndSalaries_2010_ByEducation.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2010_ByEducation.index))

# print("\nImmigrant group in Alberta")
# df_WagesAndSalaries_2010_ByImmigrant = df_WagesAndSalaries_2010.loc[
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Immigrant employees') |
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Non-immigrant employees')
# ]
# # print(df_WagesAndSalaries_2010_ByImmigrant.head(20))
# grouped = df_WagesAndSalaries_2010_ByImmigrant.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2010_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_WagesAndSalaries_2010_ByIndigenous = df_WagesAndSalaries_2010.loc[
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Indigenous identity employees') |
#     (df_WagesAndSalaries_2010['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_WagesAndSalaries_2010_ByIndigenous.head(20))
# grouped = df_WagesAndSalaries_2010_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2010_ByIndigenous.index))

In [77]:
# Dataset year in 2013 inside "Wages and Salaries"

print("\nAge group in Alberta")
df_WagesAndSalaries_2013_ByAge = df_WagesAndSalaries_2013.loc[
    (df_WagesAndSalaries_2013['Characteristics'] == '15 to 24 years') |
    (df_WagesAndSalaries_2013['Characteristics'] == '25 to 34 years') |
    (df_WagesAndSalaries_2013['Characteristics'] == '35 to 44 years') |
    (df_WagesAndSalaries_2013['Characteristics'] == '45 to 54 years') |
    (df_WagesAndSalaries_2013['Characteristics'] == '55 to 64 years') |
    (df_WagesAndSalaries_2013['Characteristics'] == '65 years old and over')]
# print(df_WagesAndSalaries_2013_ByAge.head(20))
grouped = df_WagesAndSalaries_2013_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2013_ByAge.index))

print("\nGender group in Alberta")
df_WagesAndSalaries_2013_ByGender = df_WagesAndSalaries_2013.loc[
    (df_WagesAndSalaries_2013['Characteristics'] == 'Female employees') |
    (df_WagesAndSalaries_2013['Characteristics'] == 'Male employees')
]
# print(df_WagesAndSalaries_2013_ByGender.head(20))
grouped = df_WagesAndSalaries_2013_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2013_ByGender.index))

print("\nEducation group in Alberta")
df_WagesAndSalaries_2013_ByEducation = df_WagesAndSalaries_2013.loc[
    (df_WagesAndSalaries_2013['Characteristics'] == 'High school diploma and less') |
    (df_WagesAndSalaries_2013['Characteristics'] == 'Trade certificate') |
    (df_WagesAndSalaries_2013['Characteristics'] == 'University degree and higher')
]
# print(df_WagesAndSalaries_2013_ByEducation.head(20))
grouped = df_WagesAndSalaries_2013_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2013_ByEducation.index))

print("\nImmigrant group in Alberta")
df_WagesAndSalaries_2013_ByImmigrant = df_WagesAndSalaries_2013.loc[
    (df_WagesAndSalaries_2013['Characteristics'] == 'Immigrant employees') |
    (df_WagesAndSalaries_2013['Characteristics'] == 'Non-immigrant employees')
]
# print(df_WagesAndSalaries_2013_ByImmigrant.head(20))
grouped = df_WagesAndSalaries_2013_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2013_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_WagesAndSalaries_2013_ByIndigenous = df_WagesAndSalaries_2013.loc[
#     (df_WagesAndSalaries_2013['Characteristics'] == 'Indigenous identity employees') |
#     (df_WagesAndSalaries_2013['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_WagesAndSalaries_2013_ByIndigenous.head(20))
# grouped = df_WagesAndSalaries_2013_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2013_ByIndigenous.index))


Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years          51983.0   192
25 to 34 years         252992.0   210
35 to 44 years         334759.0   210
45 to 54 years         410954.0   210
55 to 64 years         313103.0   210
65 years old and over   62434.0   192
The total number of this one is  1224

Gender group in Alberta
                       sum  size
Characteristics                 
Female employees  926997.0   210
Male employees    499293.0   210
The total number of this one is  420

Education group in Alberta
                                   sum  size
Characteristics                             
High school diploma and less  191721.0   210
Trade certificate              86569.0   198
University degree and higher  793734.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigrant employees       348492.0   198
N

In [78]:
# Dataset year in 2016 inside "Wages and Salaries"

print("\nAge group in Alberta")
df_WagesAndSalaries_2016_ByAge = df_WagesAndSalaries_2016.loc[
    (df_WagesAndSalaries_2016['Characteristics'] == '15 to 24 years') |
    (df_WagesAndSalaries_2016['Characteristics'] == '25 to 34 years') |
    (df_WagesAndSalaries_2016['Characteristics'] == '35 to 44 years') |
    (df_WagesAndSalaries_2016['Characteristics'] == '45 to 54 years') |
    (df_WagesAndSalaries_2016['Characteristics'] == '55 to 64 years') |
    (df_WagesAndSalaries_2016['Characteristics'] == '65 years old and over')]
# print(df_WagesAndSalaries_2016_ByAge.head(20))
grouped = df_WagesAndSalaries_2016_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2016_ByAge.index))

print("\nGender group in Alberta")
df_WagesAndSalaries_2016_ByGender = df_WagesAndSalaries_2016.loc[
    (df_WagesAndSalaries_2016['Characteristics'] == 'Female employees') |
    (df_WagesAndSalaries_2016['Characteristics'] == 'Male employees')
]
# print(df_WagesAndSalaries_2016_ByGender.head(20))
grouped = df_WagesAndSalaries_2016_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2016_ByGender.index))

print("\nEducation group in Alberta")
df_WagesAndSalaries_2016_ByEducation = df_WagesAndSalaries_2016.loc[
    (df_WagesAndSalaries_2016['Characteristics'] == 'High school diploma and less') |
    (df_WagesAndSalaries_2016['Characteristics'] == 'Trade certificate') |
    (df_WagesAndSalaries_2016['Characteristics'] == 'University degree and higher')
]
# print(df_WagesAndSalaries_2016_ByEducation.head(20))
grouped = df_WagesAndSalaries_2016_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2016_ByEducation.index))

print("\nImmigrant group in Alberta")
df_WagesAndSalaries_2016_ByImmigrant = df_WagesAndSalaries_2016.loc[
    (df_WagesAndSalaries_2016['Characteristics'] == 'Immigrant employees') |
    (df_WagesAndSalaries_2016['Characteristics'] == 'Non-immigrant employees')
]
# print(df_WagesAndSalaries_2016_ByImmigrant.head(20))
grouped = df_WagesAndSalaries_2016_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2016_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_WagesAndSalaries_2016_ByIndigenous = df_WagesAndSalaries_2016.loc[
#     (df_WagesAndSalaries_2016['Characteristics'] == 'Indigenous identity employees') |
#     (df_WagesAndSalaries_2016['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_WagesAndSalaries_2016_ByIndigenous.head(20))
# grouped = df_WagesAndSalaries_2016_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2016_ByIndigenous.index))



Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years          56990.0   192
25 to 34 years         285555.0   210
35 to 44 years         376045.0   210
45 to 54 years         431853.0   210
55 to 64 years         344362.0   210
65 years old and over   76349.0   192
The total number of this one is  1224

Gender group in Alberta
                        sum  size
Characteristics                  
Female employees  1026881.0   210
Male employees     544329.0   210
The total number of this one is  420

Education group in Alberta
                                   sum  size
Characteristics                             
High school diploma and less  207705.0   210
Trade certificate              86692.0   198
University degree and higher  896636.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigrant employees       400833.0   1

In [79]:
# Dataset year in 2019 inside "Wages and Salaries"

print("\nAge group in Alberta")
df_WagesAndSalaries_2019_ByAge = df_WagesAndSalaries_2019.loc[
    (df_WagesAndSalaries_2019['Characteristics'] == '15 to 24 years') |
    (df_WagesAndSalaries_2019['Characteristics'] == '25 to 34 years') |
    (df_WagesAndSalaries_2019['Characteristics'] == '35 to 44 years') |
    (df_WagesAndSalaries_2019['Characteristics'] == '45 to 54 years') |
    (df_WagesAndSalaries_2019['Characteristics'] == '55 to 64 years') |
    (df_WagesAndSalaries_2019['Characteristics'] == '65 years old and over')]
# print(df_WagesAndSalaries_2019_ByAge.head(20))
grouped = df_WagesAndSalaries_2019_ByAge.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2019_ByAge.index))

print("\nGender group in Alberta")
df_WagesAndSalaries_2019_ByGender = df_WagesAndSalaries_2019.loc[
    (df_WagesAndSalaries_2019['Characteristics'] == 'Female employees') |
    (df_WagesAndSalaries_2019['Characteristics'] == 'Male employees')
]
# print(df_WagesAndSalaries_2019_ByGender.head(20))
grouped = df_WagesAndSalaries_2019_ByGender.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2019_ByGender.index))

print("\nEducation group in Alberta")
df_WagesAndSalaries_2019_ByEducation = df_WagesAndSalaries_2019.loc[
    (df_WagesAndSalaries_2019['Characteristics'] == 'High school diploma and less') |
    (df_WagesAndSalaries_2019['Characteristics'] == 'Trade certificate') |
    (df_WagesAndSalaries_2019['Characteristics'] == 'University degree and higher')
]
# print(df_WagesAndSalaries_2019_ByEducation.head(20))
grouped = df_WagesAndSalaries_2019_ByEducation.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2019_ByEducation.index))

print("\nImmigrant group in Alberta")
df_WagesAndSalaries_2019_ByImmigrant = df_WagesAndSalaries_2019.loc[
    (df_WagesAndSalaries_2019['Characteristics'] == 'Immigrant employees') |
    (df_WagesAndSalaries_2019['Characteristics'] == 'Non-immigrant employees')
]
# print(df_WagesAndSalaries_2019_ByImmigrant.head(20))
grouped = df_WagesAndSalaries_2019_ByImmigrant.groupby(['Characteristics'])
print(grouped['VALUE'].agg([np.sum, np.size]))
print("The total number of this one is ",len(df_WagesAndSalaries_2019_ByImmigrant.index))

# print("\nIndigenous group in Alberta")
# df_WagesAndSalaries_2019_ByIndigenous = df_WagesAndSalaries_2019.loc[
#     (df_WagesAndSalaries_2019['Characteristics'] == 'Indigenous identity employees') |
#     (df_WagesAndSalaries_2019['Characteristics'] == 'Non-indigenous identity employees')
# ]
# # print(df_WagesAndSalaries_2019_ByIndigenous.head(20))
# grouped = df_WagesAndSalaries_2019_ByIndigenous.groupby(['Characteristics'])
# print(grouped['VALUE'].agg([np.sum, np.size]))
# print("The total number of this one is ",len(df_WagesAndSalaries_2019_ByIndigenous.index))


Age group in Alberta
                            sum  size
Characteristics                      
15 to 24 years          63069.0   192
25 to 34 years         334200.0   210
35 to 44 years         441049.0   210
45 to 54 years         474202.0   210
55 to 64 years         386319.0   210
65 years old and over   90513.0   192
The total number of this one is  1224

Gender group in Alberta
                        sum  size
Characteristics                  
Female employees  1179670.0   210
Male employees     609748.0   210
The total number of this one is  420

Education group in Alberta
                                    sum  size
Characteristics                              
High school diploma and less   226638.0   210
Trade certificate               93374.0   198
University degree and higher  1052815.0   198
The total number of this one is  606

Immigrant group in Alberta
                               sum  size
Characteristics                         
Immigrant employees       473697.

Final output for Average annual hours worked

In [80]:
# dfa_Target_To_Analysis = [df_AvgAnnHrsWrk_2010_ByAge, df_AvgAnnHrsWrk_2010_ByEducation, df_AvgAnnHrsWrk_2010_ByEducation, df_AvgAnnHrsWrk_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_AvgAnnHrsWrk_2013_ByAge, df_AvgAnnHrsWrk_2013_ByEducation, df_AvgAnnHrsWrk_2013_ByEducation, df_AvgAnnHrsWrk_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgAnnHrsWrk_2016_ByAge, df_AvgAnnHrsWrk_2016_ByEducation, df_AvgAnnHrsWrk_2016_ByEducation, df_AvgAnnHrsWrk_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgAnnHrsWrk_2019_ByAge, df_AvgAnnHrsWrk_2019_ByEducation, df_AvgAnnHrsWrk_2019_ByEducation, df_AvgAnnHrsWrk_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2020")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                            sum         mean    amin  median    amax  size
Characteristics                                                           
15 to 24 years         172087.0   896.286458   462.0   906.0  1108.0   192
25 to 34 years         340647.0  1622.128571  1236.0  1615.0  1946.0   210
35 to 44 years         377140.0  1795.904762  1477.0  1760.5  2183.0   210
45 to 54 years         387957.0  1847.414286  1523.0  1834.5  2294.0   210
55 to 64 years         352301.0  1677.623810  1390.0  1680.5  2000.0   210
65 years old and over  207070.0  1078.489583   724.0  1078.0  1500.0   192
Overall,
Sum :  1837202.0
Mean :  1500.9820261437908
Min/median/max : 462.0 / 1649.5 / 2294.0
Skewnewss :  -0.5937132979460663
Total size :  1224
Year of 2013
                                   sum         mean    amin  median    amax  \
Characteristics                                                               
High school diploma and less  277682.0  1322.295238  1065.0  1323.5  1

Final output for "Average annual wages and salaries"

In [81]:
# dfa_Target_To_Analysis = [df_AvgAnnWages_2010_ByAge, df_AvgAnnWages_2010_ByEducation, df_AvgAnnWages_2010_ByEducation, df_AvgAnnWages_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_AvgAnnWages_2013_ByAge, df_AvgAnnWages_2013_ByEducation, df_AvgAnnWages_2013_ByEducation, df_AvgAnnWages_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgAnnWages_2016_ByAge, df_AvgAnnWages_2016_ByEducation, df_AvgAnnWages_2016_ByEducation, df_AvgAnnWages_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgAnnWages_2019_ByAge, df_AvgAnnWages_2019_ByEducation, df_AvgAnnWages_2019_ByEducation, df_AvgAnnWages_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2020")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                              sum          mean     amin   median      amax  \
Characteristics                                                               
15 to 24 years          2784616.0  14503.208333   8769.0  13954.0   24898.0   
25 to 34 years          7953813.0  37875.300000  23433.0  36952.5   69687.0   
35 to 44 years         10794086.0  51400.409524  33680.0  50296.0   92783.0   
45 to 54 years         11596540.0  55221.619048  30761.0  53607.0   87000.0   
55 to 64 years         11087673.0  52798.442857  28716.0  49704.0  105273.0   
65 years old and over   5949938.0  30989.260417  14168.0  28631.0   81488.0   

                       size  
Characteristics              
15 to 24 years          192  
25 to 34 years          210  
35 to 44 years          210  
45 to 54 years          210  
55 to 64 years          210  
65 years old and over   192  
Overall,
Sum :  50166666.0
Mean :  40985.83823529412
Min/median/max : 8769.0 / 40621.5 / 105273.0
Skewnewss :  0.4

Final output for "Average hourly wage"

In [82]:
# dfa_Target_To_Analysis = [df_AvgHrsWages_2010_ByAge, df_AvgHrsWages_2010_ByEducation, df_AvgHrsWages_2010_ByEducation, df_AvgHrsWages_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_AvgHrsWages_2013_ByAge, df_AvgHrsWages_2013_ByEducation, df_AvgHrsWages_2013_ByEducation, df_AvgHrsWages_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgHrsWages_2016_ByAge, df_AvgHrsWages_2016_ByEducation, df_AvgHrsWages_2016_ByEducation, df_AvgHrsWages_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgHrsWages_2019_ByAge, df_AvgHrsWages_2019_ByEducation, df_AvgHrsWages_2019_ByEducation, df_AvgHrsWages_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2019")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                           sum       mean   amin  median   amax  size
Characteristics                                                      
15 to 24 years         3089.01  16.088594  11.70  15.375  26.00   192
25 to 34 years         4937.02  23.509619  15.07  22.835  47.75   210
35 to 44 years         6053.85  28.827857  17.22  28.400  54.92   210
45 to 54 years         6297.06  29.986000  16.89  29.935  50.43   210
55 to 64 years         6605.48  31.454667  16.73  29.800  63.50   210
65 years old and over  5475.51  28.518281  15.90  27.170  69.41   192
Overall,
Sum :  32457.93
Mean :  26.517916666666668
Min/median/max : 11.7 / 25.675 / 69.41
Skewnewss :  1.0223945014335014
Total size :  1224
Year of 2013
                                  sum       mean   amin  median   amax  size
Characteristics                                                             
High school diploma and less  4434.80  21.118095  12.76  20.295  35.57   210
Trade certificate             4774.81  24

Final output for "Average weekly hours worked"

In [83]:
# dfa_Target_To_Analysis = [df_AvgWeekHrsWrked_2010_ByAge, df_AvgWeekHrsWrked_2010_ByEducation, df_AvgWeekHrsWrked_2010_ByEducation, df_AvgWeekHrsWrked_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_AvgWeekHrsWrked_2013_ByAge, df_AvgWeekHrsWrked_2013_ByEducation, df_AvgWeekHrsWrked_2013_ByEducation, df_AvgWeekHrsWrked_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgWeekHrsWrked_2016_ByAge, df_AvgWeekHrsWrked_2016_ByEducation, df_AvgWeekHrsWrked_2016_ByEducation, df_AvgWeekHrsWrked_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_AvgWeekHrsWrked_2019_ByAge, df_AvgWeekHrsWrked_2019_ByEducation, df_AvgWeekHrsWrked_2019_ByEducation, df_AvgWeekHrsWrked_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2019")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                          sum       mean  amin  median  amax  size
Characteristics                                                   
15 to 24 years         3314.0  17.260417   9.0    17.0  21.0   192
25 to 34 years         6556.0  31.219048  24.0    31.0  37.0   210
35 to 44 years         7256.0  34.552381  28.0    34.0  42.0   210
45 to 54 years         7454.0  35.495238  29.0    35.0  44.0   210
55 to 64 years         6780.0  32.285714  27.0    32.0  38.0   210
65 years old and over  3983.0  20.744792  14.0    21.0  29.0   192
Overall,
Sum :  35343.0
Mean :  28.875
Min/median/max : 9.0 / 32.0 / 44.0
Skewnewss :  -0.5948697613902876
Total size :  1224
Year of 2013
                                 sum       mean  amin  median  amax  size
Characteristics                                                          
High school diploma and less  5334.0  25.400000  20.0    25.0  32.0   210
Trade certificate             6034.0  30.474747  21.0    30.0  37.0   198
University degre

Final output for "Hours Worked"

In [84]:
# dfa_Target_To_Analysis = [df_Hrs_Wrked_2010_ByAge, df_Hrs_Wrked_2010_ByEducation, df_Hrs_Wrked_2010_ByEducation, df_Hrs_Wrked_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_Hrs_Wrked_2013_ByAge, df_Hrs_Wrked_2013_ByEducation, df_Hrs_Wrked_2013_ByEducation, df_Hrs_Wrked_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_Hrs_Wrked_2016_ByAge, df_Hrs_Wrked_2016_ByEducation, df_Hrs_Wrked_2016_ByEducation, df_Hrs_Wrked_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_Hrs_Wrked_2019_ByAge, df_Hrs_Wrked_2019_ByEducation, df_Hrs_Wrked_2019_ByEducation, df_Hrs_Wrked_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2019")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                              sum          mean  amin   median      amax  size
Characteristics                                                               
15 to 24 years          3107754.0  16186.218750   6.0   3910.0  216618.0   192
25 to 34 years         10595996.0  50457.123810  32.0   9133.5  770449.0   210
35 to 44 years         11379938.0  54190.180952  32.0   9535.5  826253.0   210
45 to 54 years         13158896.0  62661.409524  29.0  10672.5  955066.0   210
55 to 64 years          9899434.0  47140.161905  20.0   8386.5  728243.0   210
65 years old and over   1974184.0  10282.208333  26.0   2329.0  146525.0   192
Overall,
Sum :  50116202.0
Mean :  40944.609477124184
Min/median/max : 6.0 / 6028.0 / 955066.0
Skewnewss :  5.157256992066241
Total size :  1224
Year of 2013
                                     sum           mean  amin   median  \
Characteristics                                                          
High school diploma and less   9259138.0   44091.

Final output for "Number of jobs"

In [85]:
# dfa_Target_To_Analysis = [df_NumOfJob_2010_ByAge, df_NumOfJob_2010_ByEducation, df_NumOfJob_2010_ByEducation, df_NumOfJob_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_NumOfJob_2013_ByAge, df_NumOfJob_2013_ByEducation, df_NumOfJob_2013_ByEducation, df_NumOfJob_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_NumOfJob_2016_ByAge, df_NumOfJob_2016_ByEducation, df_NumOfJob_2016_ByEducation, df_NumOfJob_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_NumOfJob_2019_ByAge, df_NumOfJob_2019_ByEducation, df_NumOfJob_2019_ByEducation, df_NumOfJob_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2019")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                             sum          mean  amin  median      amax  size
Characteristics                                                             
15 to 24 years         3475038.0  18099.156250  13.0  4155.0  241122.0   192
25 to 34 years         6767092.0  32224.247619  17.0  5538.0  495502.0   210
35 to 44 years         6596518.0  31411.990476  18.0  5526.0  481136.0   210
45 to 54 years         7311054.0  34814.542857  14.0  5827.0  530812.0   210
55 to 64 years         6008488.0  28611.847619  11.0  4965.5  441682.0   210
65 years old and over  1910168.0   9948.791667  18.0  2144.5  138944.0   192
Overall,
Sum :  32068358.0
Mean :  26199.63888888889
Min/median/max : 11.0 / 4155.0 / 530812.0
Skewnewss :  4.8267095355895036
Total size :  1224
Year of 2013
                                     sum          mean  amin   median  \
Characteristics                                                         
High school diploma and less   7081390.0  33720.904762  36.0   795

Final output for "Wages and Salaries"

In [86]:
# dfa_Target_To_Analysis = [df_WagesAndSalaries_2010_ByAge, df_WagesAndSalaries_2010_ByEducation, df_WagesAndSalaries_2010_ByEducation, df_WagesAndSalaries_2010_ByImmigrant]
# for df_Target_To_Analysis in dfa_Target_To_Analysis:
#       grouped = df_Target_To_Analysis.groupby(['Characteristics'])
#       print("Year of 2010")
#       print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
#       print("Overall,")
#       print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
#       print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
#       print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
#             np.median(df_Target_To_Analysis['VALUE']),"/",
#             np.max(df_Target_To_Analysis['VALUE']))
#       print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
#       print("Total size : ",len(df_Target_To_Analysis.index))

# print()
dfa_Target_To_Analysis = [df_WagesAndSalaries_2013_ByAge, df_WagesAndSalaries_2013_ByEducation, df_WagesAndSalaries_2013_ByEducation, df_WagesAndSalaries_2013_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2013")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_WagesAndSalaries_2016_ByAge, df_WagesAndSalaries_2016_ByEducation, df_WagesAndSalaries_2016_ByEducation, df_WagesAndSalaries_2016_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2016")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

print()
dfa_Target_To_Analysis = [df_WagesAndSalaries_2019_ByAge, df_WagesAndSalaries_2019_ByEducation, df_WagesAndSalaries_2019_ByEducation, df_WagesAndSalaries_2019_ByImmigrant]
for df_Target_To_Analysis in dfa_Target_To_Analysis:
      grouped = df_Target_To_Analysis.groupby(['Characteristics'])
      print("Year of 2019")
      print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
      print("Overall,")
      print("Sum : ",np.sum(df_Target_To_Analysis['VALUE']))
      print("Mean : ",np.mean(df_Target_To_Analysis['VALUE']))
      print("Min/median/max :",np.min(df_Target_To_Analysis['VALUE']),"/",
            np.median(df_Target_To_Analysis['VALUE']),"/",
            np.max(df_Target_To_Analysis['VALUE']))
      print("Skewnewss : ",df_Target_To_Analysis['VALUE'].skew())
      print("Total size : ",len(df_Target_To_Analysis.index))

Year of 2013
                            sum         mean  amin  median     amax  size
Characteristics                                                          
15 to 24 years          51983.0   270.744792   0.0    62.5   3770.0   192
25 to 34 years         252992.0  1204.723810   1.0   193.5  19337.0   210
35 to 44 years         334759.0  1594.090476   1.0   249.5  25471.0   210
45 to 54 years         410954.0  1956.923810   1.0   285.5  30946.0   210
55 to 64 years         313103.0  1490.966667   1.0   228.0  24134.0   210
65 years old and over   62434.0   325.177083   1.0    56.5   4980.0   192
Overall,
Sum :  1426225.0
Mean :  1165.2165032679738
Min/median/max : 0.0 / 131.5 / 30946.0
Skewnewss :  5.458155190569106
Total size :  1224
Year of 2013
                                   sum         mean  amin  median     amax  \
Characteristics                                                              
High school diploma and less  191721.0   912.957143   2.0   223.0  13803.0   
Trade 

In [87]:
print("Final steps, by sorting out by provinces.")

# -- sum          mean           std  size
# -- GEO                                                                    
# -- Alberta                     2193966.0   2031.450000   2695.836034  1080
# -- British Columbia            2401296.0   2223.422222   2804.925187  1080
# -- Canada                     18252439.0  16900.406481  22232.852533  1080
# -- Manitoba                     767802.0    710.927778    915.637659  1080
# -- New Brunswick                359320.0    332.703704    530.962762  1080
# -- Newfoundland and Labrador    315895.0    306.099806    482.634908  1032
# -- Northwest Territories         42804.0     41.476744     51.817046  1032
# -- Nova Scotia                  531805.0    492.412037    757.119411  1080
# -- Nunavut                       14235.0     15.208333     14.752372   936
# -- Ontario                     6601634.0   6112.624074   7594.433779  1080
# -- Prince Edward Island          77931.0     75.514535    121.297367  1032
# -- Quebec                      4271657.0   3955.237963   5580.294544  1080
# -- Saskatchewan                 650781.0    602.575000    876.896377  1080
# -- Yukon                         16914.0     18.070513     20.188135   936
# -- The total number of this one is  14688


Final steps, by sorting out by provinces.


As final step, I am classifying the data by province. This is the final step, and this is where I will get the final result with.<br />
For this step, I will use class methods to avoid duplicated and repeatitive steps to do programming.<br />
For the complex of the analysis, only the dataset of 2013-2015, 2016-2018, 2019-2021 will be used. <br />
However, 2010-2012 one will get commented.

In [88]:
print("With year of 2013 and above (three dataset), I will have to analyize, "+str(7*3*4*14)+" times.")
print("\nUse the year from 2010 to 2021 (four dataasets), I will have to analyize, "+str(7*4*4*14)+" times!")

With year of 2013 and above (three dataset), I will have to analyize, 1176 times.

Use the year from 2010 to 2021 (four dataasets), I will have to analyize, 1568 times!


Main class for Province Analysis:

In [89]:
# https://www.w3schools.com/python/python_classes.asp
# https://www.w3schools.com/python/python_for_loops.asp
# https://www.educba.com/multidimensional-array-in-python/

class ProvinceAnalysis:

    # Province :
    # -- ['Alberta',  'British Columbia',    'Canada' , 'Manitoba' , 'New Brunswick' 
    # 'Newfoundland and Labrador', 'Northwest Territories' , 'Nova Scotia' , 'Nunavut'
    # 'Ontario' , 'Prince Edward Island', 'Quebec', 'Saskatchewan', 'Yukon']

    def __init__(self, df, pd, np, pp):
        self.df = df
        self.province = ['Alberta',  'British Columbia', 'Canada', 'Manitoba', 
                        'New Brunswick', 'Newfoundland and Labrador', 
                        'Northwest Territories' , 'Nova Scotia' , 'Nunavut',
                        'Ontario' , 'Prince Edward Island', 'Quebec', 
                        'Saskatchewan', 'Yukon'
                        ]
        self.indicator = ["Average annual hours worked",
                        "Average annual wages and salaries",
                        "Average hourly wage",
                        "Average weekly hours worked",
                        "Hours Worked",
                        "Number of jobs",
                        "Wages and Salaries"]
        self.characteristic = ["Age group", "Gender", "Education Level", "Immigrant status", "Aboriginal status"]
        self.year = ["2010",
                    "below 2015",
                    "above 2016",
                    "2013",
                    "2016",
                    "2019"]
        self.pd = pd
        self.np = np
        self.pp = pp
        self.df_ByProvince = []
        for x in self.province:
            df_sorted = df.loc[df['GEO'] == x]
            self.df_ByProvince.append(df_sorted)

    def outputProvince(self, province_id):
        print(self.province[province_id])

    def outputIndicator(self, indicator_id):
        print(self.province[indicator_id])

    def outputCharacteristic(self, cha_id):
        print(self.province[cha_id])

    def outputYear(self, year_id):
        print(self.province[year_id])

    def outputAnalysis(self, province_id):
        print("\nGrab the dataset only in " + str(self.province[province_id]))
        grouped = self.df_ByProvince[province_id].groupby(['Characteristics'])
        print(grouped['VALUE'].agg([np.sum, np.mean, np.min, np.median, np.max, np.size]))
        print("")
        print("Overall,")
        print("Sum : ",np.sum(self.df_ByProvince[province_id]['VALUE']))
        print("Mean : ",np.mean(self.df_ByProvince[province_id]['VALUE']))
        print("Min/median/max :",np.min(self.df_ByProvince[province_id]['VALUE']),"/",
            np.median(self.df_ByProvince[province_id]['VALUE']),"/",
            np.max(self.df_ByProvince[province_id]['VALUE']))
        print("Skewnewss : ",self.df_ByProvince[province_id]['VALUE'].skew())
        print("Total size : ",len(self.df_ByProvince[province_id].index))

    def outputAnalysisSimple(self, province_id):
        print("\nGrab the dataset only in " + str(self.province[province_id]))
        grouped = self.df_ByProvince[province_id].groupby(['Characteristics'])
        print(grouped['VALUE'].agg([self.np.sum, self.np.mean, self.np.size]))

    def outputList(self, province_id, num):
        print("\nGrab the dataset only in " + str(self.province[province_id]))
        print(self.df_ByProvince[province_id].head(num))
        print(self.df_ByProvince[province_id].info())

    def outputPandaProfiling(self, province_id, indicator_id, type_id):

        fileName = str(self.indicator[indicator_id]) + " " + str(self.year[type_id])+" in " + str(self.province[province_id]) + ".html"
        
        pp = ProfileReport(self.df_ByProvince[province_id])
        pp_df = pp.to_html()

        print("File name will be saved under "+str(fileName))
        f = open(fileName, "a")  # Expert into html file without modifying any columns in dataset.
        f.write(pp_df)
        f.close()

Filtered by provinces by "Average annual hours worked"

In [90]:
# By Average annual hours worked categories by provinces.

# df_AvgAnnHrsWrk_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_ByAge, pd, np, pp)
# df_AvgAnnHrsWrk_2010_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge, pd, np, pp)
df_AvgAnnHrsWrk_2013_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge, pd, np, pp)
df_AvgAnnHrsWrk_2016_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge, pd, np, pp)
df_AvgAnnHrsWrk_2019_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge, pd, np, pp)

# df_AvgAnnHrsWrk_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_ByGender, pd, np, pp)
# df_AvgAnnHrsWrk_2010_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGender, pd, np, pp)
df_AvgAnnHrsWrk_2013_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender, pd, np, pp)
df_AvgAnnHrsWrk_2016_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender, pd, np, pp)
df_AvgAnnHrsWrk_2019_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender, pd, np, pp)

# df_AvgAnnHrsWrk_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_ByEducation, pd, np, pp)
# df_AvgAnnHrsWrk_2010_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation, pd, np, pp)
df_AvgAnnHrsWrk_2013_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation, pd, np, pp)
df_AvgAnnHrsWrk_2016_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation, pd, np, pp)
df_AvgAnnHrsWrk_2019_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation, pd, np, pp)

# df_AvgAnnHrsWrk_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_ByImmigrant, pd, np, pp)
# df_AvgAnnHrsWrk_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByImmigrant, pd, np, pp)
df_AvgAnnHrsWrk_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByImmigrant, pd, np, pp)
df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant, pd, np, pp)
df_AvgAnnHrsWrk_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByImmigrant, pd, np, pp)

# df_AvgAnnHrsWrk_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_ByIndigenous, pd, np, pp)
# df_AvgAnnHrsWrk_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByIndigenous, pd, np, pp)
# df_AvgAnnHrsWrk_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByIndigenous, pd, np, pp)
# df_AvgAnnHrsWrk_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByIndigenous, pd, np, pp)
# df_AvgAnnHrsWrk_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByIndigenous, pd, np, pp)

Filtered by provinces by "Average wages and salaries"

In [91]:
# By Average annual wages and salaries worked categories by provinces.

# df_AvgAnnWages_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnWages_ByAge, pd, np, pp)
# df_AvgAnnWages_2010_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnWages_2010_ByAge, pd, np, pp)
df_AvgAnnWages_2013_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnWages_2013_ByAge, pd, np, pp)
df_AvgAnnWages_2016_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnWages_2016_ByAge, pd, np, pp)
df_AvgAnnWages_2019_ByAge_Provinces = ProvinceAnalysis(df_AvgAnnWages_2019_ByAge, pd, np, pp)

# df_AvgAnnWages_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnWages_ByGender, pd, np, pp)
# df_AvgAnnWages_2010_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnWages_2010_ByGender, pd, np, pp)
df_AvgAnnWages_2013_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnWages_2013_ByGender, pd, np, pp)
df_AvgAnnWages_2016_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnWages_2016_ByGender, pd, np, pp)
df_AvgAnnWages_2019_ByGender_Provinces = ProvinceAnalysis(df_AvgAnnWages_2019_ByGender, pd, np, pp)

# df_AvgAnnWages_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnWages_ByEducation, pd, np, pp)
# df_AvgAnnWages_2010_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnWages_2010_ByEducation, pd, np, pp)
df_AvgAnnWages_2013_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnWages_2013_ByEducation, pd, np, pp)
df_AvgAnnWages_2016_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnWages_2016_ByEducation, pd, np, pp)
df_AvgAnnWages_2019_ByEducation_Provinces = ProvinceAnalysis(df_AvgAnnWages_2019_ByEducation, pd, np, pp)

# df_AvgAnnWages_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnWages_ByImmigrant, pd, np, pp)
# df_AvgAnnWages_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnWages_2010_ByImmigrant, pd, np, pp)
df_AvgAnnWages_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnWages_2013_ByImmigrant, pd, np, pp)
df_AvgAnnWages_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnWages_2016_ByImmigrant, pd, np, pp)
df_AvgAnnWages_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgAnnWages_2019_ByImmigrant, pd, np, pp)

# df_AvgAnnWages_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnWages_ByIndigenous, pd, np, pp)
# df_AvgAnnWages_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnWages_2010_ByIndigenous, pd, np, pp)
# df_AvgAnnWages_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnWages_2013_ByIndigenous, pd, np, pp)
# df_AvgAnnWages_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnWages_2016_ByIndigenous, pd, np, pp)
# df_AvgAnnWages_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgAnnWages_2019_ByIndigenous, pd, np, pp)

Filtered by provinces by "Average hourly wage"

In [92]:
# By Average hourly wages and salaries worked categories by provinces.

# df_AvgHrsWages_ByAge_Provinces = ProvinceAnalysis(df_AvgHrsWages_ByAge, pd, np, pp)
# df_AvgHrsWages_2010_ByAge_Provinces = ProvinceAnalysis(df_AvgHrsWages_2010_ByAge, pd, np, pp)
df_AvgHrsWages_2013_ByAge_Provinces = ProvinceAnalysis(df_AvgHrsWages_2013_ByAge, pd, np, pp)
df_AvgHrsWages_2016_ByAge_Provinces = ProvinceAnalysis(df_AvgHrsWages_2016_ByAge, pd, np, pp)
df_AvgHrsWages_2019_ByAge_Provinces = ProvinceAnalysis(df_AvgHrsWages_2019_ByAge, pd, np, pp)

# df_AvgHrsWages_ByGender_Provinces = ProvinceAnalysis(df_AvgHrsWages_ByGender, pd, np, pp)
# df_AvgHrsWages_2010_ByGender_Provinces = ProvinceAnalysis(df_AvgHrsWages_2010_ByGender, pd, np, pp)
df_AvgHrsWages_2013_ByGender_Provinces = ProvinceAnalysis(df_AvgHrsWages_2013_ByGender, pd, np, pp)
df_AvgHrsWages_2016_ByGender_Provinces = ProvinceAnalysis(df_AvgHrsWages_2016_ByGender, pd, np, pp)
df_AvgHrsWages_2019_ByGender_Provinces = ProvinceAnalysis(df_AvgHrsWages_2019_ByGender, pd, np, pp)

# df_AvgHrsWages_ByEducation_Provinces = ProvinceAnalysis(df_AvgHrsWages_ByEducation, pd, np, pp)
# df_AvgHrsWages_2010_ByEducation_Provinces = ProvinceAnalysis(df_AvgHrsWages_2010_ByEducation, pd, np, pp)
df_AvgHrsWages_2013_ByEducation_Provinces = ProvinceAnalysis(df_AvgHrsWages_2013_ByEducation, pd, np, pp)
df_AvgHrsWages_2016_ByEducation_Provinces = ProvinceAnalysis(df_AvgHrsWages_2016_ByEducation, pd, np, pp)
df_AvgHrsWages_2019_ByEducation_Provinces = ProvinceAnalysis(df_AvgHrsWages_2019_ByEducation, pd, np, pp)

# df_AvgHrsWages_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgHrsWages_ByImmigrant, pd, np, pp)
# df_AvgHrsWages_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgHrsWages_2010_ByImmigrant, pd, np, pp)
df_AvgHrsWages_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgHrsWages_2013_ByImmigrant, pd, np, pp)
df_AvgHrsWages_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgHrsWages_2016_ByImmigrant, pd, np, pp)
df_AvgHrsWages_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgHrsWages_2019_ByImmigrant, pd, np, pp)

# df_AvgHrsWages_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgHrsWages_ByIndigenous, pd, np, pp)
# df_AvgHrsWages_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgHrsWages_2010_ByIndigenous, pd, np, pp)
# df_AvgHrsWages_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgHrsWages_2013_ByIndigenous, pd, np, pp)
# df_AvgHrsWages_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgHrsWages_2016_ByIndigenous, pd, np, pp)
# df_AvgHrsWages_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgHrsWages_2019_ByIndigenous, pd, np, pp)

Filtered by provinces by "Average weekly hours worked"

In [93]:
# By Average annual wages and salaries worked categories by provinces.

# df_AvgWeekHrsWrked_ByAge_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_ByAge, pd, np, pp)
# df_AvgWeekHrsWrked_2010_ByAge_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2010_ByAge, pd, np, pp)
df_AvgWeekHrsWrked_2013_ByAge_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2013_ByAge, pd, np, pp)
df_AvgWeekHrsWrked_2016_ByAge_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2016_ByAge, pd, np, pp)
df_AvgWeekHrsWrked_2019_ByAge_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2019_ByAge, pd, np, pp)

# df_AvgWeekHrsWrked_ByGender_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_ByGender, pd, np, pp)
# df_AvgWeekHrsWrked_2010_ByGender_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2010_ByGender, pd, np, pp)
df_AvgWeekHrsWrked_2013_ByGender_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2013_ByGender, pd, np, pp)
df_AvgWeekHrsWrked_2016_ByGender_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2016_ByGender, pd, np, pp)
df_AvgWeekHrsWrked_2019_ByGender_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2019_ByGender, pd, np, pp)

# df_AvgWeekHrsWrked_ByEducation_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_ByEducation, pd, np, pp)
# df_AvgWeekHrsWrked_2010_ByEducation_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2010_ByEducation, pd, np, pp)
df_AvgWeekHrsWrked_2013_ByEducation_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2013_ByEducation, pd, np, pp)
df_AvgWeekHrsWrked_2016_ByEducation_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2016_ByEducation, pd, np, pp)
df_AvgWeekHrsWrked_2019_ByEducation_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2019_ByEducation, pd, np, pp)

# df_AvgWeekHrsWrked_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_ByImmigrant, pd, np, pp)
# df_AvgWeekHrsWrked_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2010_ByImmigrant, pd, np, pp)
df_AvgWeekHrsWrked_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2013_ByImmigrant, pd, np, pp)
df_AvgWeekHrsWrked_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2016_ByImmigrant, pd, np, pp)
df_AvgWeekHrsWrked_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2019_ByImmigrant, pd, np, pp)

# df_AvgWeekHrsWrked_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_ByIndigenous, pd, np, pp)
# df_AvgWeekHrsWrked_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2010_ByIndigenous, pd, np, pp)
# df_AvgWeekHrsWrked_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2013_ByIndigenous, pd, np, pp)
# df_AvgWeekHrsWrked_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2016_ByIndigenous, pd, np, pp)
# df_AvgWeekHrsWrked_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_AvgWeekHrsWrked_2019_ByIndigenous, pd, np, pp)

Filtered by provinces by "Hours Worked"

In [94]:
# By Hours workee and salaries worked categories by provinces.

# df_Hrs_Wrked_ByAge_Provinces = ProvinceAnalysis(df_Hrs_Wrked_ByAge, pd, np, pp)
# df_Hrs_Wrked_2010_ByAge_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2010_ByAge, pd, np, pp)
df_Hrs_Wrked_2013_ByAge_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2013_ByAge, pd, np, pp)
df_Hrs_Wrked_2016_ByAge_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2016_ByAge, pd, np, pp)
df_Hrs_Wrked_2019_ByAge_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2019_ByAge, pd, np, pp)

# df_Hrs_Wrked_ByGender_Provinces = ProvinceAnalysis(df_Hrs_Wrked_ByGender, pd, np, pp)
# df_Hrs_Wrked_2010_ByGender_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2010_ByGender, pd, np, pp)
df_Hrs_Wrked_2013_ByGender_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2013_ByGender, pd, np, pp)
df_Hrs_Wrked_2016_ByGender_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2016_ByGender, pd, np, pp)
df_Hrs_Wrked_2019_ByGender_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2019_ByGender, pd, np, pp)

# df_Hrs_Wrked_ByEducation_Provinces = ProvinceAnalysis(df_Hrs_Wrked_ByEducation, pd, np, pp)
# df_Hrs_Wrked_2010_ByEducation_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2010_ByEducation, pd, np, pp)
df_Hrs_Wrked_2013_ByEducation_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2013_ByEducation, pd, np, pp)
df_Hrs_Wrked_2016_ByEducation_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2016_ByEducation, pd, np, pp)
df_Hrs_Wrked_2019_ByEducation_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2019_ByEducation, pd, np, pp)

# df_Hrs_Wrked_ByImmigrant_Provinces = ProvinceAnalysis(df_Hrs_Wrked_ByImmigrant, pd, np, pp)
# df_Hrs_Wrked_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2010_ByImmigrant, pd, np, pp)
df_Hrs_Wrked_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2013_ByImmigrant, pd, np, pp)
df_Hrs_Wrked_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2016_ByImmigrant, pd, np, pp)
df_Hrs_Wrked_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2019_ByImmigrant, pd, np, pp)

# df_Hrs_Wrked_ByIndigenous_Provinces = ProvinceAnalysis(df_Hrs_Wrked_ByIndigenous, pd, np, pp)
# df_Hrs_Wrked_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2010_ByIndigenous, pd, np, pp)
# df_Hrs_Wrked_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2013_ByIndigenous, pd, np, pp)
# df_Hrs_Wrked_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2016_ByIndigenous, pd, np, pp)
# df_Hrs_Wrked_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_Hrs_Wrked_2019_ByIndigenous, pd, np, pp)


Filtered by provinces by "Number of jobs"

In [95]:
# By Number of jobs and salaries worked categories by provinces.

# df_NumOfJob_ByAge_Provinces = ProvinceAnalysis(df_NumOfJob_ByAge, pd, np, pp)
# df_NumOfJob_2010_ByAge_Provinces = ProvinceAnalysis(df_NumOfJob_2010_ByAge, pd, np, pp)
df_NumOfJob_2013_ByAge_Provinces = ProvinceAnalysis(df_NumOfJob_2013_ByAge, pd, np, pp)
df_NumOfJob_2016_ByAge_Provinces = ProvinceAnalysis(df_NumOfJob_2016_ByAge, pd, np, pp)
df_NumOfJob_2019_ByAge_Provinces = ProvinceAnalysis(df_NumOfJob_2019_ByAge, pd, np, pp)

# df_NumOfJob_ByGender_Provinces = ProvinceAnalysis(df_NumOfJob_ByGender, pd, np, pp)
# df_NumOfJob_2010_ByGender_Provinces = ProvinceAnalysis(df_NumOfJob_2010_ByGender, pd, np, pp)
df_NumOfJob_2013_ByGender_Provinces = ProvinceAnalysis(df_NumOfJob_2013_ByGender, pd, np, pp)
df_NumOfJob_2016_ByGender_Provinces = ProvinceAnalysis(df_NumOfJob_2016_ByGender, pd, np, pp)
df_NumOfJob_2019_ByGender_Provinces = ProvinceAnalysis(df_NumOfJob_2019_ByGender, pd, np, pp)

# df_NumOfJob_ByEducation_Provinces = ProvinceAnalysis(df_NumOfJob_ByEducation, pd, np, pp)
# df_NumOfJob_2010_ByEducation_Provinces = ProvinceAnalysis(df_NumOfJob_2010_ByEducation, pd, np, pp)
df_NumOfJob_2013_ByEducation_Provinces = ProvinceAnalysis(df_NumOfJob_2013_ByEducation, pd, np, pp)
df_NumOfJob_2016_ByEducation_Provinces = ProvinceAnalysis(df_NumOfJob_2016_ByEducation, pd, np, pp)
df_NumOfJob_2019_ByEducation_Provinces = ProvinceAnalysis(df_NumOfJob_2019_ByEducation, pd, np, pp)

# df_NumOfJob_ByImmigrant_Provinces = ProvinceAnalysis(df_NumOfJob_ByImmigrant, pd, np, pp)
# df_NumOfJob_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_NumOfJob_2010_ByImmigrant, pd, np, pp)
df_NumOfJob_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_NumOfJob_2013_ByImmigrant, pd, np, pp)
df_NumOfJob_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_NumOfJob_2016_ByImmigrant, pd, np, pp)
df_NumOfJob_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_NumOfJob_2019_ByImmigrant, pd, np, pp)

# df_NumOfJob_ByIndigenous_Provinces = ProvinceAnalysis(df_NumOfJob_ByIndigenous, pd, np, pp)
# df_NumOfJob_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_NumOfJob_2010_ByIndigenous, pd, np, pp)
# df_NumOfJob_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_NumOfJob_2013_ByIndigenous, pd, np, pp)
# df_NumOfJob_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_NumOfJob_2016_ByIndigenous, pd, np, pp)
# df_NumOfJob_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_NumOfJob_2019_ByIndigenous, pd, np, pp)

Filted by provinces by "Wages and Salaries"

In [96]:
# By Wages and Salaries worked categories by provinces.

# df_WagesAndSalaries_ByAge_Provinces = ProvinceAnalysis(df_WagesAndSalaries_ByAge, pd, np, pp)
# df_WagesAndSalaries_2010_ByAge_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2010_ByAge, pd, np, pp)
df_WagesAndSalaries_2013_ByAge_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2013_ByAge, pd, np, pp)
df_WagesAndSalaries_2016_ByAge_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2016_ByAge, pd, np, pp)
df_WagesAndSalaries_2019_ByAge_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2019_ByAge, pd, np, pp)

# df_WagesAndSalaries_ByGender_Provinces = ProvinceAnalysis(df_WagesAndSalaries_ByGender, pd, np, pp)
# df_WagesAndSalaries_2010_ByGender_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2010_ByGender, pd, np, pp)
df_WagesAndSalaries_2013_ByGender_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2013_ByGender, pd, np, pp)
df_WagesAndSalaries_2016_ByGender_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2016_ByGender, pd, np, pp)
df_WagesAndSalaries_2019_ByGender_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2019_ByGender, pd, np, pp)

# df_WagesAndSalaries_ByEducation_Provinces = ProvinceAnalysis(df_WagesAndSalaries_ByEducation, pd, np, pp)
# df_WagesAndSalaries_2010_ByEducation_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2010_ByEducation, pd, np, pp)
df_WagesAndSalaries_2013_ByEducation_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2013_ByEducation, pd, np, pp)
df_WagesAndSalaries_2016_ByEducation_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2016_ByEducation, pd, np, pp)
df_WagesAndSalaries_2019_ByEducation_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2019_ByEducation, pd, np, pp)

# df_WagesAndSalaries_ByImmigrant_Provinces = ProvinceAnalysis(df_WagesAndSalaries_ByImmigrant, pd, np, pp)
# df_WagesAndSalaries_2010_ByImmigrant_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2010_ByImmigrant, pd, np, pp)
df_WagesAndSalaries_2013_ByImmigrant_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2013_ByImmigrant, pd, np, pp)
df_WagesAndSalaries_2016_ByImmigrant_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2016_ByImmigrant, pd, np, pp)
df_WagesAndSalaries_2019_ByImmigrant_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2019_ByImmigrant, pd, np, pp)

# df_WagesAndSalaries_ByIndigenous_Provinces = ProvinceAnalysis(df_WagesAndSalaries_ByIndigenous, pd, np, pp)
# df_WagesAndSalaries_2010_ByIndigenous_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2010_ByIndigenous, pd, np, pp)
# df_WagesAndSalaries_2013_ByIndigenous_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2013_ByIndigenous, pd, np, pp)
# df_WagesAndSalaries_2016_ByIndigenous_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2016_ByIndigenous, pd, np, pp)
# df_WagesAndSalaries_2019_ByIndigenous_Provinces = ProvinceAnalysis(df_WagesAndSalaries_2019_ByIndigenous, pd, np, pp)

After data analysis of the data, the code below can output the result of the analysis.

Main class for outputting the analysis

In [97]:
# https://howtodoinjava.com/python-examples/python-print-to-file/


class OutputProvinceAnalysis:

    # Province :
    # -- ['Alberta',  'British Columbia',    'Canada' , 'Manitoba' , 'New Brunswick' 
    # 'Newfoundland and Labrador', 'Northwest Territories' , 'Nova Scotia' , 'Nunavut'
    # 'Ontario' , 'Prince Edward Island', 'Quebec', 'Saskatchewan', 'Yukon']

    def __init__(self, df, PC, yrs, pd, np, pp):

        self.df_output = df
        self.ProCode = PC
        self.YearOutput = yrs

    def OutputResult(self):
        print(str(self.YearOutput))
        self.df_output.outputList(self.ProCode, 20)
        self.df_output.outputAnalysis(self.ProCode)

    def OutputPandaProfiling(self):
        if self.YearOutput == '2010':
            print("Year 2010 is not valid at this moment")
            # self.df_output.outputPandaProfiling(self.ProCode,0,0)
        elif self.YearOutput == '2013':
            self.df_output.outputPandaProfiling(self.ProCode,0,3)
        elif self.YearOutput == '2016':
            self.df_output.outputPandaProfiling(self.ProCode,0,4)
        elif self.YearOutput == '2018':
            self.df_output.outputPandaProfiling(self.ProCode,0,5)
        else:
            print("Error!")

Filtering only Alberta.

Only finished "Average annual hours worked".

In [98]:
# # -- Alberta                     2193966.0   2031.450000   2695.836034  1080
# (Before)
# print("2016 - Overall")
# df_AvgAnnHrsWrk_above_2016_ByAge_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_above_2016_ByAge_Provinces.outputAnalysis(0)
# print("2016")
# df_AvgAnnHrsWrk_2016_ByAge_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2016_ByAge_Provinces.outputAnalysis(0)
# print("2018")
# df_AvgAnnHrsWrk_2018_ByAge_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2018_ByAge_Provinces.outputAnalysis(0)
# print("2020")
# df_AvgAnnHrsWrk_2020_ByAge_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2020_ByAge_Provinces.outputAnalysis(0)

# print("2016 - Overall")
# df_AvgAnnHrsWrk_above_2016_ByGender_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_above_2016_ByGender_Provinces.outputAnalysis(0)
# print("2016")
# df_AvgAnnHrsWrk_2016_ByGender_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2016_ByGender_Provinces.outputAnalysis(0)
# print("2018")
# df_AvgAnnHrsWrk_2018_ByGender_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2018_ByGender_Provinces.outputAnalysis(0)
# print("2020")
# df_AvgAnnHrsWrk_2020_ByGender_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2020_ByGender_Provinces.outputAnalysis(0)

# print("2016 - Overall")
# df_AvgAnnHrsWrk_above_2016_ByEducation_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_above_2016_ByEducation_Provinces.outputAnalysis(0)
# print("2016")
# df_AvgAnnHrsWrk_2016_ByEducation_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2016_ByEducation_Provinces.outputAnalysis(0)
# print("2018")
# df_AvgAnnHrsWrk_2018_ByEducation_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2018_ByEducation_Provinces.outputAnalysis(0)
# print("2020")
# df_AvgAnnHrsWrk_2020_ByEducation_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2020_ByEducation_Provinces.outputAnalysis(0)

# print("2016 - Overall")
# df_AvgAnnHrsWrk_above_2016_ByImmigrant_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_above_2016_ByImmigrant_Provinces.outputAnalysis(0)
# print("2016")
# df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces.outputAnalysis(0)
# print("2018")
# df_AvgAnnHrsWrk_2018_ByImmigrant_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2018_ByImmigrant_Provinces.outputAnalysis(0)
# print("2020")
# df_AvgAnnHrsWrk_2020_ByImmigrant_Provinces.outputList(0, 20)
# df_AvgAnnHrsWrk_2020_ByImmigrant_Provinces.outputAnalysis(0)

In [99]:
# # -- Alberta                     2193966.0   2031.450000   2695.836034  1080
# (After using the class)

ProCode = 0

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Alberta
       REF_DATE      GEO                                             Sector  \
32217      2013  Alberta                      Total non-profit institutions   
32224      2013  Alberta                      Total non-profit institutions   
32231      2013  Alberta                      Total non-profit institutions   
32238      2013  Alberta                      Total non-profit institutions   
32245      2013  Alberta                      Total non-profit institutions   
32252      2013  Alberta                      Total non-profit institutions   
32343      2013  Alberta  Total non-profit institutions excluding govern...   
32350      2013  Alberta  Total non-profit institutions excluding govern...   
32357      2013  Alberta  Total non-profit institutions excluding govern...   
32364      2013  Alberta  Total non-profit institutions excluding govern...   
32371      2013  Alberta  Total non-profit institutions excluding govern...   
32378      20

Panda Profiling for Final Result for Alberta

In [100]:
# ProCode = 0
# (Before)
# df_AvgAnnHrsWrk_above_2016_ByAge_Provinces.outputPandaProfiling(ProCode,0,2)
# df_AvgAnnHrsWrk_above_2016_ByGender_Provinces.outputPandaProfiling(ProCode,0,2)
# df_AvgAnnHrsWrk_above_2016_ByEducation_Provinces.outputPandaProfiling(ProCode,0,2)
# df_AvgAnnHrsWrk_above_2016_ByImmigrant_Provinces.outputPandaProfiling(ProCode,0,2)

# df_AvgAnnHrsWrk_2016_ByAge_Provinces.outputPandaProfiling(ProCode,0,3)
# df_AvgAnnHrsWrk_2016_ByGender_Provinces.outputPandaProfiling(ProCode,0,3)
# df_AvgAnnHrsWrk_2016_ByEducation_Provinces.outputPandaProfiling(ProCode,0,3)
# df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces.outputPandaProfiling(ProCode,0,3)

# df_AvgAnnHrsWrk_2018_ByAge_Provinces.outputPandaProfiling(ProCode,0,4)
# df_AvgAnnHrsWrk_2018_ByGender_Provinces.outputPandaProfiling(ProCode,0,4)
# df_AvgAnnHrsWrk_2018_ByEducation_Provinces.outputPandaProfiling(ProCode,0,4)
# df_AvgAnnHrsWrk_2018_ByImmigrant_Provinces.outputPandaProfiling(ProCode,0,4)

# df_AvgAnnHrsWrk_2020_ByAge_Provinces.outputPandaProfiling(ProCode,0,5)
# df_AvgAnnHrsWrk_2020_ByGender_Provinces.outputPandaProfiling(ProCode,0,5)
# df_AvgAnnHrsWrk_2020_ByEducation_Provinces.outputPandaProfiling(ProCode,0,5)
# df_AvgAnnHrsWrk_2020_ByImmigrant_Provinces.outputPandaProfiling(ProCode,0,5)

In [101]:
# After

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputPandaProfiling()

Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 15.17it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.32s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.57it/s]


File name will be saved under Average annual hours worked 2013 in Alberta.html


Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.12it/s, Completed]                       
Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]

Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.22s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.80it/s]


File name will be saved under Average annual hours worked 2016 in Alberta.html
Error!


Summarize dataset: 100%|██████████| 18/18 [00:00<00:00, 19.07it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.23s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.69it/s]


File name will be saved under Average annual hours worked 2013 in Alberta.html


Summarize dataset: 100%|██████████| 18/18 [00:00<00:00, 20.56it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [11:45<00:00, 705.99s/it]
Render HTML: 100%|██████████| 1/1 [00:01<00:00,  1.36s/it]


File name will be saved under Average annual hours worked 2016 in Alberta.html
Error!


Summarize dataset: 100%|██████████| 18/18 [00:02<00:00,  8.64it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:04<00:00,  4.18s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  3.05it/s]


File name will be saved under Average annual hours worked 2013 in Alberta.html


Summarize dataset: 100%|██████████| 18/18 [00:01<00:00, 16.21it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.04s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.95it/s]


File name will be saved under Average annual hours worked 2016 in Alberta.html
Error!


Summarize dataset: 100%|██████████| 18/18 [00:00<00:00, 20.70it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.07s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.76it/s]


File name will be saved under Average annual hours worked 2013 in Alberta.html


Summarize dataset: 100%|██████████| 18/18 [00:00<00:00, 22.11it/s, Completed]                       
Generate report structure: 100%|██████████| 1/1 [00:03<00:00,  3.32s/it]
Render HTML: 100%|██████████| 1/1 [00:00<00:00,  2.76it/s]

File name will be saved under Average annual hours worked 2016 in Alberta.html
Error!





Filtering only British Columbia.

Only finished "Average annual hours worked".

In [102]:
# -- British Columbia            2401296.0   2223.422222   2804.925187  1080

ProCode = 1

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in British Columbia
       REF_DATE               GEO  \
32847      2013  British Columbia   
32854      2013  British Columbia   
32861      2013  British Columbia   
32868      2013  British Columbia   
32875      2013  British Columbia   
32882      2013  British Columbia   
32973      2013  British Columbia   
32980      2013  British Columbia   
32987      2013  British Columbia   
32994      2013  British Columbia   
33001      2013  British Columbia   
33008      2013  British Columbia   
33099      2013  British Columbia   
33106      2013  British Columbia   
33113      2013  British Columbia   
33120      2013  British Columbia   
33127      2013  British Columbia   
33134      2013  British Columbia   
33225      2013  British Columbia   
33232      2013  British Columbia   

                                                  Sector  \
32847                      Total non-profit institutions   
32854                      Total non-profit institutio

Panda Profiling for Final Result for BC

In [103]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering if "GEO" levelled as "Canada" ONLY.

Only finished "Average annual hours worked".

In [104]:
# -- Canada                     18252439.0  16900.406481  22232.852533  1080

ProCode = 2 # Canada

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Canada
       REF_DATE     GEO                                             Sector  \
26547      2013  Canada                      Total non-profit institutions   
26554      2013  Canada                      Total non-profit institutions   
26561      2013  Canada                      Total non-profit institutions   
26568      2013  Canada                      Total non-profit institutions   
26575      2013  Canada                      Total non-profit institutions   
26582      2013  Canada                      Total non-profit institutions   
26673      2013  Canada  Total non-profit institutions excluding govern...   
26680      2013  Canada  Total non-profit institutions excluding govern...   
26687      2013  Canada  Total non-profit institutions excluding govern...   
26694      2013  Canada  Total non-profit institutions excluding govern...   
26701      2013  Canada  Total non-profit institutions excluding govern...   
26708      2013  Canada  T

Min/median/max : 1250.0 / 1542.0 / 1776.0
Skewnewss :  -0.23893215242347804
Total size :  45
2016

Grab the dataset only in Canada
       REF_DATE     GEO                                             Sector  \
52979      2016  Canada                      Total non-profit institutions   
52986      2016  Canada                      Total non-profit institutions   
53000      2016  Canada                      Total non-profit institutions   
53105      2016  Canada  Total non-profit institutions excluding govern...   
53112      2016  Canada  Total non-profit institutions excluding govern...   
53126      2016  Canada  Total non-profit institutions excluding govern...   
53231      2016  Canada  Non-profit institutions serving households (co...   
53238      2016  Canada  Non-profit institutions serving households (co...   
53252      2016  Canada  Non-profit institutions serving households (co...   
53357      2016  Canada                   Business non-profit institutions   
53364      

Panda Profiling for Final Result for GEO = Canada

In [105]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Manitoba.

Only finished "Average annual hours worked".

In [106]:
# -- Manitoba                     767802.0    710.927778    915.637659  1080

ProCode = 3

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Manitoba
       REF_DATE       GEO                                             Sector  \
30957      2013  Manitoba                      Total non-profit institutions   
30964      2013  Manitoba                      Total non-profit institutions   
30971      2013  Manitoba                      Total non-profit institutions   
30978      2013  Manitoba                      Total non-profit institutions   
30985      2013  Manitoba                      Total non-profit institutions   
30992      2013  Manitoba                      Total non-profit institutions   
31083      2013  Manitoba  Total non-profit institutions excluding govern...   
31090      2013  Manitoba  Total non-profit institutions excluding govern...   
31097      2013  Manitoba  Total non-profit institutions excluding govern...   
31104      2013  Manitoba  Total non-profit institutions excluding govern...   
31111      2013  Manitoba  Total non-profit institutions excluding govern...   


Panda Profiling for Final Result for Manitoba

In [107]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only New Brunswick.

Only finished "Average annual hours worked".

In [108]:
# -- New Brunswick                359320.0    332.703704    530.962762  1080

ProCode = 4

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in New Brunswick
       REF_DATE            GEO  \
29067      2013  New Brunswick   
29074      2013  New Brunswick   
29081      2013  New Brunswick   
29088      2013  New Brunswick   
29095      2013  New Brunswick   
29102      2013  New Brunswick   
29193      2013  New Brunswick   
29200      2013  New Brunswick   
29207      2013  New Brunswick   
29214      2013  New Brunswick   
29221      2013  New Brunswick   
29228      2013  New Brunswick   
29319      2013  New Brunswick   
29326      2013  New Brunswick   
29333      2013  New Brunswick   
29340      2013  New Brunswick   
29347      2013  New Brunswick   
29354      2013  New Brunswick   
29445      2013  New Brunswick   
29452      2013  New Brunswick   

                                                  Sector  \
29067                      Total non-profit institutions   
29074                      Total non-profit institutions   
29081                      Total non-profit institutions   


       REF_DATE            GEO  \
28983      2013  New Brunswick   
28990      2013  New Brunswick   
29109      2013  New Brunswick   
29116      2013  New Brunswick   
29235      2013  New Brunswick   
29242      2013  New Brunswick   
29361      2013  New Brunswick   
29368      2013  New Brunswick   
29487      2013  New Brunswick   
29494      2013  New Brunswick   
37803      2014  New Brunswick   
37810      2014  New Brunswick   
37929      2014  New Brunswick   
37936      2014  New Brunswick   
38055      2014  New Brunswick   
38062      2014  New Brunswick   
38181      2014  New Brunswick   
38188      2014  New Brunswick   
38307      2014  New Brunswick   
38314      2014  New Brunswick   

                                                  Sector   Characteristics  \
28983                      Total non-profit institutions    Male employees   
28990                      Total non-profit institutions  Female employees   
29109  Total non-profit institutions excluding gove

Panda Profiling for Final Result for New Brunswick

In [109]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Newfoundland and Labrador.

Only finished "Average annual hours worked".

In [110]:
# -- Newfoundland and Labrador    315895.0    306.099806    482.634908  1032

ProCode = 5

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Newfoundland and Labrador
       REF_DATE                        GEO  \
27177      2013  Newfoundland and Labrador   
27184      2013  Newfoundland and Labrador   
27191      2013  Newfoundland and Labrador   
27198      2013  Newfoundland and Labrador   
27205      2013  Newfoundland and Labrador   
27212      2013  Newfoundland and Labrador   
27303      2013  Newfoundland and Labrador   
27310      2013  Newfoundland and Labrador   
27317      2013  Newfoundland and Labrador   
27324      2013  Newfoundland and Labrador   
27331      2013  Newfoundland and Labrador   
27338      2013  Newfoundland and Labrador   
27429      2013  Newfoundland and Labrador   
27436      2013  Newfoundland and Labrador   
27443      2013  Newfoundland and Labrador   
27450      2013  Newfoundland and Labrador   
27457      2013  Newfoundland and Labrador   
27464      2013  Newfoundland and Labrador   
27555      2013  Newfoundland and Labrador   
27562      2013  Newfou

                                  sum         mean    amin  median    amax  \
Characteristics                                                              
High school diploma and less  20037.0  1335.800000  1178.0  1305.0  1507.0   
Trade certificate             23932.0  1595.466667  1445.0  1615.0  1734.0   
University degree and higher  28061.0  1870.733333  1726.0  1876.0  2018.0   

                              size  
Characteristics                     
High school diploma and less    15  
Trade certificate               15  
University degree and higher    15  

Overall,
Sum :  72030.0
Mean :  1600.6666666666667
Min/median/max : 1178.0 / 1615.0 / 2018.0
Skewnewss :  -0.018789938513471514
Total size :  45
2013

Grab the dataset only in Newfoundland and Labrador
       REF_DATE                        GEO  \
27149      2013  Newfoundland and Labrador   
27156      2013  Newfoundland and Labrador   
27170      2013  Newfoundland and Labrador   
27275      2013  Newfoundland and Lab

Panda Profiling for Final Result for Newfoundland

In [111]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Northwest Territories.

Only finished "Average annual hours worked".

In [112]:
# -- Northwest Territories         42804.0     41.476744     51.817046  1032

ProCode = 6

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Northwest Territories
       REF_DATE                    GEO  \
34107      2013  Northwest Territories   
34114      2013  Northwest Territories   
34121      2013  Northwest Territories   
34128      2013  Northwest Territories   
34135      2013  Northwest Territories   
34142      2013  Northwest Territories   
34233      2013  Northwest Territories   
34240      2013  Northwest Territories   
34247      2013  Northwest Territories   
34254      2013  Northwest Territories   
34261      2013  Northwest Territories   
34268      2013  Northwest Territories   
34366      2013  Northwest Territories   
34373      2013  Northwest Territories   
34380      2013  Northwest Territories   
34387      2013  Northwest Territories   
34492      2013  Northwest Territories   
34499      2013  Northwest Territories   
34506      2013  Northwest Territories   
34513      2013  Northwest Territories   

                                                  Sector  \
3410

Panda Profiling for Final Result for Northwest Territories

In [113]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Nova Scotia.

Only finished "Average annual hours worked".

In [114]:
# -- Nova Scotia                  531805.0    492.412037    757.119411  1080

ProCode = 7

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Nova Scotia
       REF_DATE          GEO  \
28437      2013  Nova Scotia   
28444      2013  Nova Scotia   
28451      2013  Nova Scotia   
28458      2013  Nova Scotia   
28465      2013  Nova Scotia   
28472      2013  Nova Scotia   
28563      2013  Nova Scotia   
28570      2013  Nova Scotia   
28577      2013  Nova Scotia   
28584      2013  Nova Scotia   
28591      2013  Nova Scotia   
28598      2013  Nova Scotia   
28689      2013  Nova Scotia   
28696      2013  Nova Scotia   
28703      2013  Nova Scotia   
28710      2013  Nova Scotia   
28717      2013  Nova Scotia   
28724      2013  Nova Scotia   
28815      2013  Nova Scotia   
28822      2013  Nova Scotia   

                                                  Sector  \
28437                      Total non-profit institutions   
28444                      Total non-profit institutions   
28451                      Total non-profit institutions   
28458                      Total non-profit 

                                  sum         mean    amin  median    amax  \
Characteristics                                                              
High school diploma and less  19633.0  1308.866667  1192.0  1318.0  1429.0   
Trade certificate             24443.0  1629.533333  1512.0  1622.0  1805.0   
University degree and higher  26653.0  1776.866667  1673.0  1770.0  1894.0   

                              size  
Characteristics                     
High school diploma and less    15  
Trade certificate               15  
University degree and higher    15  

Overall,
Sum :  70729.0
Mean :  1571.7555555555555
Min/median/max : 1192.0 / 1622.0 / 1894.0
Skewnewss :  -0.358085565941755
Total size :  45
2016

Grab the dataset only in Nova Scotia
       REF_DATE          GEO  \
54869      2016  Nova Scotia   
54876      2016  Nova Scotia   
54890      2016  Nova Scotia   
54995      2016  Nova Scotia   
55002      2016  Nova Scotia   
55016      2016  Nova Scotia   
55121      201

Panda Profiling for Final Result for Nova Scotia

In [115]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Nunavut.

Only finished "Average annual hours worked".

In [116]:
# -- Nunavut                       14235.0     15.208333     14.752372   936

ProCode = 8

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Nunavut
       REF_DATE      GEO                                             Sector  \
34737      2013  Nunavut                      Total non-profit institutions   
34744      2013  Nunavut                      Total non-profit institutions   
34751      2013  Nunavut                      Total non-profit institutions   
34758      2013  Nunavut                      Total non-profit institutions   
34765      2013  Nunavut                      Total non-profit institutions   
34772      2013  Nunavut                      Total non-profit institutions   
34863      2013  Nunavut  Total non-profit institutions excluding govern...   
34870      2013  Nunavut  Total non-profit institutions excluding govern...   
34877      2013  Nunavut  Total non-profit institutions excluding govern...   
34884      2013  Nunavut  Total non-profit institutions excluding govern...   
34891      2013  Nunavut  Total non-profit institutions excluding govern...   
34898      20

       REF_DATE      GEO                                             Sector  \
87629      2019  Nunavut                      Total non-profit institutions   
87636      2019  Nunavut                      Total non-profit institutions   
87650      2019  Nunavut                      Total non-profit institutions   
87755      2019  Nunavut  Total non-profit institutions excluding govern...   
87762      2019  Nunavut  Total non-profit institutions excluding govern...   
87776      2019  Nunavut  Total non-profit institutions excluding govern...   
87881      2019  Nunavut  Non-profit institutions serving households (co...   
88007      2019  Nunavut                   Business non-profit institutions   
88133      2019  Nunavut                 Government non-profit institutions   
88140      2019  Nunavut                 Government non-profit institutions   
88154      2019  Nunavut                 Government non-profit institutions   
96449      2020  Nunavut                      Total 

Panda Profiling for Final Result for Nunavut

In [117]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtered by Ontario

Only finished "Average annual hours worked".

In [118]:
# -- Ontario                     6601634.0   6112.624074   7594.433779  1080

ProCode = 9

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Ontario
       REF_DATE      GEO                                             Sector  \
30327      2013  Ontario                      Total non-profit institutions   
30334      2013  Ontario                      Total non-profit institutions   
30341      2013  Ontario                      Total non-profit institutions   
30348      2013  Ontario                      Total non-profit institutions   
30355      2013  Ontario                      Total non-profit institutions   
30362      2013  Ontario                      Total non-profit institutions   
30453      2013  Ontario  Total non-profit institutions excluding govern...   
30460      2013  Ontario  Total non-profit institutions excluding govern...   
30467      2013  Ontario  Total non-profit institutions excluding govern...   
30474      2013  Ontario  Total non-profit institutions excluding govern...   
30481      2013  Ontario  Total non-profit institutions excluding govern...   
30488      20

Sum :  68408.0
Mean :  1520.1777777777777
Min/median/max : 1194.0 / 1533.0 / 1804.0
Skewnewss :  -0.3107123670831418
Total size :  45
2019

Grab the dataset only in Ontario
       REF_DATE      GEO                                             Sector  \
83219      2019  Ontario                      Total non-profit institutions   
83226      2019  Ontario                      Total non-profit institutions   
83240      2019  Ontario                      Total non-profit institutions   
83345      2019  Ontario  Total non-profit institutions excluding govern...   
83352      2019  Ontario  Total non-profit institutions excluding govern...   
83366      2019  Ontario  Total non-profit institutions excluding govern...   
83471      2019  Ontario  Non-profit institutions serving households (co...   
83478      2019  Ontario  Non-profit institutions serving households (co...   
83492      2019  Ontario  Non-profit institutions serving households (co...   
83597      2019  Ontario             

Panda Profiling for Final Result for Ontario

In [119]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Prince Edward Island.

Only finished "Average annual hours worked".

In [120]:
# -- Prince Edward Island          77931.0     75.514535    121.297367  1032

ProCode = 10

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Prince Edward Island
       REF_DATE                   GEO  \
27807      2013  Prince Edward Island   
27814      2013  Prince Edward Island   
27821      2013  Prince Edward Island   
27828      2013  Prince Edward Island   
27835      2013  Prince Edward Island   
27842      2013  Prince Edward Island   
27933      2013  Prince Edward Island   
27940      2013  Prince Edward Island   
27947      2013  Prince Edward Island   
27954      2013  Prince Edward Island   
27961      2013  Prince Edward Island   
27968      2013  Prince Edward Island   
28059      2013  Prince Edward Island   
28066      2013  Prince Edward Island   
28073      2013  Prince Edward Island   
28080      2013  Prince Edward Island   
28087      2013  Prince Edward Island   
28094      2013  Prince Edward Island   
28185      2013  Prince Edward Island   
28192      2013  Prince Edward Island   

                                                  Sector  \
27807                     

Panda Profiling for Final Result for Prince Edward Island

In [121]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Quebec.

Only finished "Average annual hours worked".

In [122]:
# -- Quebec                      4271657.0   3955.237963   5580.294544  1080

ProCode = 11

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Quebec
       REF_DATE     GEO                                             Sector  \
29697      2013  Quebec                      Total non-profit institutions   
29704      2013  Quebec                      Total non-profit institutions   
29711      2013  Quebec                      Total non-profit institutions   
29718      2013  Quebec                      Total non-profit institutions   
29725      2013  Quebec                      Total non-profit institutions   
29732      2013  Quebec                      Total non-profit institutions   
29823      2013  Quebec  Total non-profit institutions excluding govern...   
29830      2013  Quebec  Total non-profit institutions excluding govern...   
29837      2013  Quebec  Total non-profit institutions excluding govern...   
29844      2013  Quebec  Total non-profit institutions excluding govern...   
29851      2013  Quebec  Total non-profit institutions excluding govern...   
29858      2013  Quebec  T

<class 'pandas.core.frame.DataFrame'>
Index: 45 entries, 82589 to 100754
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   REF_DATE         45 non-null     int64  
 1   GEO              45 non-null     object 
 2   Sector           45 non-null     object 
 3   Characteristics  45 non-null     object 
 4   Indicators       45 non-null     object 
 5   UOM              45 non-null     object 
 6   SCALAR_FACTOR    45 non-null     object 
 7   VALUE            45 non-null     float64
dtypes: float64(1), int64(1), object(6)
memory usage: 3.2+ KB
None

Grab the dataset only in Quebec
                                  sum         mean    amin  median    amax  \
Characteristics                                                              
High school diploma and less  19065.0  1271.000000  1188.0  1267.0  1365.0   
Trade certificate             23118.0  1541.200000  1423.0  1527.0  1700.0   
University degree and h

Panda Profiling for Final Result for Quebec

In [123]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Saskatchewan.

Only finished "Average annual hours worked".

In [124]:
# -- Saskatchewan                 650781.0    602.575000    876.896377  1080

ProCode = 12

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Saskatchewan
       REF_DATE           GEO  \
31587      2013  Saskatchewan   
31594      2013  Saskatchewan   
31601      2013  Saskatchewan   
31608      2013  Saskatchewan   
31615      2013  Saskatchewan   
31622      2013  Saskatchewan   
31713      2013  Saskatchewan   
31720      2013  Saskatchewan   
31727      2013  Saskatchewan   
31734      2013  Saskatchewan   
31741      2013  Saskatchewan   
31748      2013  Saskatchewan   
31839      2013  Saskatchewan   
31846      2013  Saskatchewan   
31853      2013  Saskatchewan   
31860      2013  Saskatchewan   
31867      2013  Saskatchewan   
31874      2013  Saskatchewan   
31965      2013  Saskatchewan   
31972      2013  Saskatchewan   

                                                  Sector  \
31587                      Total non-profit institutions   
31594                      Total non-profit institutions   
31601                      Total non-profit institutions   
31608                 

Panda Profiling for Final Result for Saskatchewan

In [125]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Filtering only Yukon.

Only finished "Average annual hours worked".

In [126]:
# -- Yukon                         16914.0     18.070513     20.188135   936

ProCode = 13

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Yukon
       REF_DATE    GEO                                             Sector  \
33477      2013  Yukon                      Total non-profit institutions   
33484      2013  Yukon                      Total non-profit institutions   
33491      2013  Yukon                      Total non-profit institutions   
33498      2013  Yukon                      Total non-profit institutions   
33505      2013  Yukon                      Total non-profit institutions   
33512      2013  Yukon                      Total non-profit institutions   
33603      2013  Yukon  Total non-profit institutions excluding govern...   
33610      2013  Yukon  Total non-profit institutions excluding govern...   
33617      2013  Yukon  Total non-profit institutions excluding govern...   
33624      2013  Yukon  Total non-profit institutions excluding govern...   
33631      2013  Yukon  Total non-profit institutions excluding govern...   
33638      2013  Yukon  Total non-profi

Panda Profiling for Final Result for Yukon

In [127]:
# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByAge_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByAge_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByAge_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByAge_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByGende_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByGender_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByGender_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByGender_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByEducation_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2010_ByEducation_Provinces, ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2013_ByEducation_Provinces, ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces, ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(df_AvgAnnHrsWrk_2019_ByEducation_Provinces, ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

Display all of the indicators, characteristics, and provinces will display too much to analysis.<br />
Therefore, I made custom code below to display output for speciic province and the indicators<br />
Characteristics (Age, gender, education, immigrant) still display all of them.<br />
To use, enter province name (string) and the indicator (in numeric 0-6).<br />

In [128]:
# https://realpython.com/python-raise-exception/
# https://www.geeksforgeeks.org/check-multiple-conditions-in-if-statement-python/

list_of_province = ['Alberta',  'British Columbia', 'Canada', 'Manitoba', 'New Brunswick', 'Newfoundland and Labrador', 
                    'Northwest Territories' , 'Nova Scotia' , 'Nunavut', 'Ontario' , 
                    'Prince Edward Island', 'Quebec', 'Saskatchewan', 'Yukon']
categorized_province = input('Enter your province name:')
categorized_province = categorized_province.lower()
categorized_province_analysis = "Invalid"
for x in list_of_province:
    if (x.lower() == categorized_province) :
        categorized_province_analysis = x
        break

list_indicator = ["Average annual hours worked",
             "Average annual wages and salaries",
             "Average hourly wage",
             "Average weekly hours worked",
             "Hours Worked", 
             "Number of jobs", 
             "Wages and Salaries"]
categorized_indicator = ""
categorized_indcator_analysis = "Invalid"

if categorized_province_analysis == "Invalid":
    print("Run this code again, this province doesn't exist.")
    print("End of Program")

else:
    print("Enter your indicator attributes, ")
    print("0. Average annual hours worked")
    print("1. Average annual wages and salaries")
    print("2. Average hourly wage")
    print("3. Average weekly hours worked")
    print("4. Hours Worked")
    print("5. Number of jobs")
    print("6. Wages and Salaries")
    categorized_indicator = input('Enter your indicator number:')
    try:
        categorized_indicator = int(categorized_indicator)
    except ValueError:
        if not type(categorized_indicator) is int:
            print("Run this code again, invalid integer")
            print("End of Program")
    try:
        categorized_indicator_analysis = list_indicator[categorized_indicator]
    except IndexError:
        print("Run this code again, invalid range (0-6)")
        print("End of Program")

    if categorized_province_analysis == "Invalid":
        print("End of Program")
    else:
        num = 0
        for x in list_of_province:
            if (categorized_province_analysis == x) :
                ProCode = num
                break
            else:
                num = num + 1
        if categorized_indicator == 0:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_AvgAnnHrsWrk_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnHrsWrk_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnHrsWrk_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnHrsWrk_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnHrsWrk_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 1:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_AvgAnnWages_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnWages_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnWages_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_AvgAnnWages_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgAnnWages_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 2:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_AvgHrsWages_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_AvgHrsWages_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_AvgHrsWages_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_AvgHrsWages_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgHrsWages_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 3:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_AvgWeekHrsWrked_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_AvgWeekHrsWrked_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_AvgWeekHrsWrked_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_AvgWeekHrsWrked_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_AvgWeekHrsWrked_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 4:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_Hrs_Wrked_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_Hrs_Wrked_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_Hrs_Wrked_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_Hrs_Wrked_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_Hrs_Wrked_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 5:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_NumOfJob_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_NumOfJob_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_NumOfJob_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_NumOfJob_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_NumOfJob_2019_ByImmigrant_Provinces)
        elif categorized_indicator == 6:
            categorized_indicator_analysis = []
            categorized_indicator_analysis.append("") # df_WagesAndSalaries_2010_ByAge_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2013_ByAge_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2016_ByAge_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2019_ByAge_Provinces)
            categorized_indicator_analysis.append("") # df_WagesAndSalaries_2010_ByGender_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2013_ByGender_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2016_ByGender_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2019_ByGender_Provinces)
            categorized_indicator_analysis.append("") # df_WagesAndSalaries_2010_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2013_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2016_ByEducation_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2019_ByEducation_Provinces)
            categorized_indicator_analysis.append("") # df_WagesAndSalaries_2010_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2013_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2016_ByImmigrant_Provinces)
            categorized_indicator_analysis.append(df_WagesAndSalaries_2019_ByImmigrant_Provinces)
        else:
            pass
        print("The province name is "+categorized_province_analysis)
        print("The indicator name is "+list_indicator[categorized_indicator])
        print("From here, the ProCode will be "+str(ProCode))
        print("The array from categorized_indicator_analysis is")
        print(categorized_indicator_analysis)

Enter your indicator attributes, 
0. Average annual hours worked
1. Average annual wages and salaries
2. Average hourly wage
3. Average weekly hours worked
4. Hours Worked
5. Number of jobs
6. Wages and Salaries
The province name is Ontario
The indicator name is Average annual hours worked
From here, the ProCode will be 9
The array from categorized_indicator_analysis is
['', <__main__.ProvinceAnalysis object at 0x00000282670046D0>, <__main__.ProvinceAnalysis object at 0x0000028267004370>, <__main__.ProvinceAnalysis object at 0x0000028267004E80>, '', <__main__.ProvinceAnalysis object at 0x0000028267004A30>, <__main__.ProvinceAnalysis object at 0x0000028266B1A520>, <__main__.ProvinceAnalysis object at 0x00000282653A1F40>, '', <__main__.ProvinceAnalysis object at 0x0000028266E56400>, <__main__.ProvinceAnalysis object at 0x00000282660DBAF0>, <__main__.ProvinceAnalysis object at 0x0000028266B1E7C0>, '', <__main__.ProvinceAnalysis object at 0x000002826649F370>, <__main__.ProvinceAnalysis obj

Filtering based on ProCode and indicator given above. <br />
Must run the code above to run this code. <br />

In [129]:
# This code need code above.

# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[0], ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[1], ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[2], ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[3], ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[4], ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[5], ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[6], ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[7], ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[8], ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[9], ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[10], ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[11], ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[12], ProCode, "2010", pd, np, pp)
# df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[13], ProCode, "2013", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[14], ProCode, "2016", pd, np, pp)
df_Display_Output_Result.OutputResult()
df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[15], ProCode, "2019", pd, np, pp)
df_Display_Output_Result.OutputResult()

2013

Grab the dataset only in Ontario
       REF_DATE      GEO                                             Sector  \
30327      2013  Ontario                      Total non-profit institutions   
30334      2013  Ontario                      Total non-profit institutions   
30341      2013  Ontario                      Total non-profit institutions   
30348      2013  Ontario                      Total non-profit institutions   
30355      2013  Ontario                      Total non-profit institutions   
30362      2013  Ontario                      Total non-profit institutions   
30453      2013  Ontario  Total non-profit institutions excluding govern...   
30460      2013  Ontario  Total non-profit institutions excluding govern...   
30467      2013  Ontario  Total non-profit institutions excluding govern...   
30474      2013  Ontario  Total non-profit institutions excluding govern...   
30481      2013  Ontario  Total non-profit institutions excluding govern...   
30488      20

Panda Profiling based on ProCode and indicator given above. <br />
Must run the code above to run this code. <br />

In [130]:
# # df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[0], ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[1], ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[2], ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[3], ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[4], ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[5], ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[6], ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[7], ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[8], ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[9], ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[10], ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[11], ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()

# # df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[12], ProCode, "2010", pd, np, pp)
# # df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[13], ProCode, "2013", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[14], ProCode, "2016", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()
# df_Display_Output_Result = OutputProvinceAnalysis(categorized_indicator_analysis[15], ProCode, "2019", pd, np, pp)
# df_Display_Output_Result.OutputPandaProfiling()