![](images/Header.png)

# How much time should an engineer with no software background devote to a master's degree in data science?

This is the number of hours that a person who studied a branch of engineering unrelated to software (road engineering, mining, forestry, agronomy...) needs to dedicate to a master's degree in data science.

The data were recorded with the Pomodoro Pro application, which tracked the time spent on each subject of the master's degree as well as on the subjects considered leveling courses. Durations are in minutes.
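Since the recorded durations are in minutes while the charts further down report hours, a small conversion helper may make the units easier to follow. This is only a sketch: `minutes_to_hours` is a hypothetical name that does not appear in the notebook, and 4380 is the `Duration` value of the "Data Warehouse" row in the table below.

```python
def minutes_to_hours(minutes):
    """Split a duration in minutes into (whole hours, leftover minutes)."""
    return divmod(minutes, 60)


# 4380 minutes is the duration recorded for "Data Warehouse"
hours, rest = minutes_to_hours(4380)
print(f"{hours} h {rest} min")  # 73 h 0 min
```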

![](images/figure1.jpeg)

In [4]:
# Import libraries
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go

In [7]:
# Loading data
df = pd.read_csv('data/master_hours_total.csv', sep=";")

# Table display
df

| Unnamed: 0 | Duration | Assigned task | Category | Type |
|---|---|---|---|---|
| 0 | 4380 | Data Warehouse | Programming | Levelling |
| 1 | 7800 | Aprendizaje por refuerzo | Programming | Master |
| 2 | 3720 | Arquitecturas de bases de datos no tradicionales | Miscellany | Master |
| 3 | 7800 | Deep Learning | Programming | Master |
| 4 | 7320 | Diseño y programación orientada a objetos | Programming | Levelling |
| 5 | 7680 | Diseño y uso de base de datos analíticas | Miscellany | Levelling |
| 6 | 5160 | Estadística avanzada | Miscellany | Master |
| 7 | 2820 | Fundamentos de la ciencia de datos | Miscellany | Master |
| 8 | 4860 | Fundamentos de programación con Python | Programming | Levelling |
| 9 | 2940 | Fundamentos de redes y arquitecturas | Miscellany | Levelling |
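The plotting cells index the columns as `' Duration'` and `' Assigned task'`, with a leading space, which suggests that the CSV header has a space after some separators. The sketch below shows one possible way to normalise the names and add an hours column. It assumes that header layout and uses a two-row excerpt in place of the real file; `raw`, `sample` and the `Hours` column are illustrative names, not part of the notebook.

```python
import io

import pandas as pd

# Two rows copied from the table above; the header spacing is an assumption
# inferred from the ' Duration' and ' Assigned task' keys used in the cells.
raw = (
    "; Duration; Assigned task;Category;Type\n"
    "0;4380;Data Warehouse;Programming;Levelling\n"
    "3;7800;Deep Learning;Programming;Master\n"
)

sample = pd.read_csv(io.StringIO(raw), sep=";")

# Remove leading/trailing whitespace from every column name
sample.columns = sample.columns.str.strip()

# Durations are recorded in minutes; derive hours for readability
sample["Hours"] = sample["Duration"] / 60

print(list(sample.columns))
print(sample[["Assigned task", "Hours"]])
```

With the names stripped, the later cells could use `df['Duration']` instead of `df[' Duration']`, but they would all have to be changed together.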


### What is the proportion of time to be devoted to each subject within the master's degree?

In [3]:
# Pie chart: share of the total study time taken by each subject
fig = px.pie(df, values=' Duration',
             names=' Assigned task',
             color_discrete_sequence=px.colors.sequential.RdBu)

fig.update_layout(
    title_text='Total study time = {:.0f} minutes'.format(
        sum(df[' Duration'])))

fig.update_traces(hoverinfo='label+percent', textinfo='percent',
                  textfont_size=15, textposition='inside',
                  marker=dict(line=dict(color='#000000', width=2)))

fig.show()

In [79]:
# Plot two donut charts: time by category and time by course type
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'},
                                            {'type':'domain'}]])
fig.add_trace(go.Pie(labels=df.Category, values=df[' Duration'].values,
                     name="About coding"),
              1, 1)
fig.add_trace(go.Pie(labels=df.Type, values=df[' Duration'].values,
                     name="About level"),
              1, 2)

fig.update_traces(hole=.4, hoverinfo="label+percent+name")

fig.update_layout(
    title_text="My background has never been related to software",
    # Add annotations in the center of the donut pies.
    annotations=[dict(text='Coding?',
                      x=0.18, y=0.5, font_size=20, showarrow=False),
                 dict(text='Level?',
                      x=0.82, y=0.5, font_size=20, showarrow=False)])
fig.show()

### How much time will I devote to the leveling courses and how much to the master's degree courses as such?

In [118]:
# Plot two pies sized relative to each other (shared scalegroup), one per course type
fig = make_subplots(1, 2, specs=[[{'type':'domain'}, {'type':'domain'}]],
                    subplot_titles=['Levelling = {:.0f}h'.format(
                        sum(df[df['Type'] == 'Levelling'][' Duration'])/60),
                                    'Master = {:.0f}h'.format(
                        sum(df[df['Type'] == 'Master'][' Duration'])/60)])
fig.add_trace(go.Pie(labels=df[df['Type'] == "Levelling"][' Assigned task'],
                     values=df[df['Type'] == "Levelling"][' Duration'].values/60,
                     scalegroup='one',
                     name="Levelling"), 1, 1)
fig.add_trace(go.Pie(labels=df[df['Type'] == "Master"][' Assigned task'],
                     values=df[df['Type'] == "Master"][' Duration'].values/60,
                     scalegroup='one',
                     name="Master"), 1, 2)

fig.update_layout(title_text="Total master's time = {:.0f}h".format(
    sum(df[' Duration'])/60))
fig.show()
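The totals shown in the chart titles can also be checked numerically with a `groupby`. The sketch below uses only the ten rows visible in the table above, so the real CSV may give different figures if it holds more rows. The column names are written without the leading spaces for readability, and `rows`, `sample`, `hours_by_type` and `share_by_category` are illustrative names.

```python
import pandas as pd

# (Duration in minutes, Category, Type) for the ten rows shown in the table
rows = [
    (4380, "Programming", "Levelling"),
    (7800, "Programming", "Master"),
    (3720, "Miscellany", "Master"),
    (7800, "Programming", "Master"),
    (7320, "Programming", "Levelling"),
    (7680, "Miscellany", "Levelling"),
    (5160, "Miscellany", "Master"),
    (2820, "Miscellany", "Master"),
    (4860, "Programming", "Levelling"),
    (2940, "Miscellany", "Levelling"),
]
sample = pd.DataFrame(rows, columns=["Duration", "Category", "Type"])

# Hours per course type (minutes / 60)
hours_by_type = sample.groupby("Type")["Duration"].sum() / 60

# Percentage of the total time per category, rounded to one decimal
total_minutes = sample["Duration"].sum()
share_by_category = (
    sample.groupby("Category")["Duration"].sum() / total_minutes * 100
).round(1)

print(hours_by_type)      # Levelling 453.0, Master 455.0
print(share_by_category)  # Miscellany 41.0, Programming 59.0
```

For these ten rows the levelling and master's subjects take almost the same time (453 h and 455 h), and about 59% of the total goes to programming subjects.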

In [124]:
# Pie chart of the master's subjects (values in hours)
colors = ['gold', 'mediumturquoise', 'darkorange', 'lightgreen',
          'AliceBlue', 'Brown', 'Coral']

fig = go.Figure(data=[go.Pie(labels=df[df['Type'] == 'Master'][' Assigned task'],
                             values=(df[df['Type'] == 'Master'][' Duration']/60))])

fig.update_traces(hoverinfo='label+percent', textinfo='value+percent',
                  textfont_size=15, textposition='inside',
                  marker=dict(colors=colors, line=dict(color='#000000', width=2)))

fig.update_layout(
    title_text="Study time for master's degree courses = {:.0f}h".format(
        sum(df[df['Type'] == 'Master'][' Duration'])/60))

fig.show()

In [125]:
# Pie chart of the levelling subjects (values in hours)
colors = ['gold', 'mediumturquoise', 'darkorange', 'lightgreen',
          'AliceBlue', 'Brown', 'Coral']

fig = go.Figure(data=[go.Pie(labels=df[df['Type'] == 'Levelling'][' Assigned task'],
                             values=(df[df['Type'] == 'Levelling'][' Duration']/60))])

fig.update_traces(hoverinfo='label+percent', textinfo='value+percent',
                  textfont_size=15, textposition='inside',
                  marker=dict(colors=colors, line=dict(color='#000000', width=2)))

fig.update_layout(
    title_text='Study time for levelling courses = {:.0f}h'.format(
        sum(df[df['Type'] == 'Levelling'][' Duration'])/60))

fig.show()