# Key numbers on acceptance @ The Technical University of Denmark (DTU)
<p style="font-size: 20px; line-height: 26pt">Students accepted to the undergraduate fields of study Software Technology (computer science) and Mathematics and Technology.</p>

Analysis created by Henriette Steenhoff, s134869

----
![start_screen.png](start_screen_2019.PNG)

<p style="font-size: 20px; line-height: 26pt">See how these statistics looked in previous years (tracking started in 2018)</p>

| Category | 2018 | 2019 | 2020 |
|---|---|---|---|
| Women in CS @ DTU       | 31   | 37   | - |
| Women graduating yearly | 1.5  | 1.85 | - |
| Women graduated in all  | 20   | 26   | - |


Previous images:
* Link to 2018 image: [2018](start_screen_2018.PNG)

----
# Basics

<p style="font-size: 20px; line-height: 26pt">All work here is based on data from the [DTU Study Data Warehouse](http://dtu-studiedatavarehus.ait.dtu.dk/Default.aspx) -- sadly this site is only available in Danish.</p>

<p style="font-size: 20px; line-height: 26pt">The fun starts in the [Plotting](#Plotting) section -- fast forward to this point if you are less interested in all the nitty gritty details.</p>

----

# Prerequisite 

In [3]:
# IMPORTS
import re
from urllib.request import urlopen
import json
import numpy as np
import pandas as pd

# Plotting tools
# NOTE: plotly.plotly exists in plotly < 4; from plotly 4 onwards the online
# plotting module lives in the separate chart_studio package.
import plotly
from IPython.display import Image
import plotly.plotly as py
import plotly.graph_objs as go
import matplotlib.pyplot as plt
%matplotlib inline

# Credentials from JSON file
with open("plotly_credentials.json", "r") as file:
    creds = json.load(file)

# API access to plotting tools
plotly.tools.set_credentials_file(username=creds['username'], api_key=creds['password'])

# QUERY FOR HISTORIC DATA ON ACCEPTANCE, SOFTWARE TECHNOLOGY
request_url = 'http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0'

response = urlopen(request_url)
#temp_data = json.load(response)

I was hoping to fetch all the data from the webpage with urllib to make data collection easier. Because of the way the website is set up, this turned out to be more work than I had time for.

In [4]:
# Fetching data directly from the webpage works poorly: the response is raw HTML bytes, not structured data
# The information is available and can be fetched, but it requires one hell of a regular expression.
response.read(100)

b'\r\n\r\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/'
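As an alternative to a large regular expression, the table cells in an HTML response can be collected with the standard library's `html.parser`. The following is only a minimal sketch: the class name `TableCellParser` and the sample markup are hypothetical, and the real DTU page may be structured differently (nested tables, ASP.NET controls), so this has not been verified against it.

```python
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical sample markup, NOT copied from the DTU page.
sample_html = """
<table>
  <tr><th>Year</th><th>Accepted</th></tr>
  <tr><td>2017</td><td>80</td></tr>
  <tr><td>2018</td><td>88</td></tr>
</table>
"""

parser = TableCellParser()
parser.feed(sample_html)
rows = parser.rows
print(rows)
```

For the live page, the decoded response body would be fed to the parser instead of `sample_html`; the correct text encoding is an assumption that would need checking.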

----
# Key numbers
<p style="font-size: 20px; line-height: 26pt">If you want to have a look at the numbers yourself directly in the database, follow the links below.</p>

<p style="font-size: 20px; line-height: 26pt">**Before you get started:**</p>

<p style="font-size: 20px; line-height: 26pt">The URLs below fetch numbers on students accepted to and graduated from the civil engineering bachelor in Computer Science at DTU, for all years from 2004 onwards. A bachelor's degree at DTU takes 3 years, so ideally the number of students accepted in year `x` would equal the number of students graduating in year `x+3`; in practice there will of course be drop-outs and changes in field of study. There are no graduates for 2004-2006 because the field of study was first offered in 2004, which means that the first students in software technology graduated in 2007. </p>

## Software Technology
<p style="font-size: 20px; line-height: 26pt">As fetching the URL content directly from the website works poorly (I would have hoped for an API/Open data solution, but nope), I have added the numbers on **students accepted** from the webpage manually from this query:</p>

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0)

<p style="font-size: 20px; line-height: 26pt">For the **graduated students**, I used this query:</p>

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=67&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=67&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0)

## Mathematics and Technology

*<span style="color:red">After the 18th of December, numbers on students in the field "Mathematics and Technology" will also be added, as these students are also considered computer science students</span>*

<p style="font-size: 20px; line-height: 26pt">Students **accepted** for all years, Mathematics and Technology, DTU:</p>

<p style="font-size: 20px; line-height: 26pt">Sadly for MT, data are only available from 2013 onwards. </p>

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=0&udd=1&ret=30&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=0&udd=1&ret=30&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0)

<p style="font-size: 20px; line-height: 26pt">**Students graduated**:</p>

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=18&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=18&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0)


<p style="font-size: 20px; line-height: 26pt">All URLs above fetch numbers for all years from 2004 onwards for the civil engineering bachelor in Computer Science at DTU.</p>


----
## From numbers to graphs with code 
<p style="font-size: 20px; line-height: 26pt">I will comment some more on the numbers further down.</p>

### Software Technology
#### Accepted

In [46]:
# Percentage male/female accepted on undergrad Software technology at DTU

YEAR         = list(range(2004,2019))

# Percentage accepted female/male
DTU_CS_P_F   = np.asarray([5, 2, 2, 0,  10,5, 9, 7, 7, 8, 12,7, 10,19,17])
DTU_CS_P_M   = np.asarray([95,98,98,100,90,95,91,93,93,92,88,93,90,81,83])

# All accepted, total
CS_YEARLY_ACCEPTED = np.asarray([56,66,62,63,62,55,54,56,55,61,68,68,63,80,88])

# Number of people accepted (calculated from the above)
no_females = np.round(np.multiply(DTU_CS_P_F,np.asarray(CS_YEARLY_ACCEPTED/100.0)))
no_males   = np.round(np.multiply(DTU_CS_P_M,np.asarray(CS_YEARLY_ACCEPTED/100.0)))
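Because the head counts are reconstructed from rounded percentages, it is worth checking that they still add up. The sketch below repeats the software technology numbers from the cell above under shorter, hypothetical names (`pct_f`, `pct_m`, `total`) and verifies that the percentages sum to 100 and that the rounded female and male counts sum to the yearly totals.

```python
import numpy as np

# Same numbers as DTU_CS_P_F, DTU_CS_P_M and CS_YEARLY_ACCEPTED above
pct_f = np.asarray([5, 2, 2, 0, 10, 5, 9, 7, 7, 8, 12, 7, 10, 19, 17])
pct_m = np.asarray([95, 98, 98, 100, 90, 95, 91, 93, 93, 92, 88, 93, 90, 81, 83])
total = np.asarray([56, 66, 62, 63, 62, 55, 54, 56, 55, 61, 68, 68, 63, 80, 88])

# Head counts reconstructed from the percentages
females = np.round(pct_f * total / 100.0).astype(int)
males = np.round(pct_m * total / 100.0).astype(int)

# Both checks should hold if the manually entered numbers are consistent
percentages_ok = bool(np.all(pct_f + pct_m == 100))
counts_ok = bool(np.all(females + males == total))
print(percentages_ok, counts_ok)
```

If either check fails after new yearly numbers are typed in, a typo in one of the three arrays is the most likely cause.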

#### Graduated

In [47]:
GRAD_CS_F = np.asarray([0,0,0,0,5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6])
GRAD_CS_M = np.asarray([0,0,0,9,40,39,42,33,39,41,42,47,48,55,50])
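The three-year offset between acceptance and graduation described earlier can be made explicit by shifting the two series against each other. This is a minimal sketch for the female software technology students; the names `STUDY_YEARS`, `accepted` and `graduated` are introduced here for illustration only, and the pairing assumes every student finishes in exactly three years, which the data cannot confirm.

```python
import numpy as np

years = np.arange(2004, 2019)

# Female acceptance counts, reconstructed as in the cell above
pct_f = np.asarray([5, 2, 2, 0, 10, 5, 9, 7, 7, 8, 12, 7, 10, 19, 17])
total = np.asarray([56, 66, 62, 63, 62, 55, 54, 56, 55, 61, 68, 68, 63, 80, 88])
accepted_f = np.round(pct_f * total / 100.0).astype(int)

# Same numbers as GRAD_CS_F
graduated_f = np.asarray([0, 0, 0, 0, 5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6])

STUDY_YEARS = 3  # nominal length of a bachelor at DTU

# Pair each acceptance year x with graduation year x + 3
cohort_years = years[:-STUDY_YEARS]   # 2004..2015
accepted = accepted_f[:-STUDY_YEARS]
graduated = graduated_f[STUDY_YEARS:]  # 2007..2018

for y, a, g in zip(cohort_years, accepted, graduated):
    print(f"accepted {y}: {a:2d}  ->  graduated {y + STUDY_YEARS}: {g:2d}")
```

The comparison is rough: students who take an extra semester or change field of study end up in a different row than their cohort.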

### Mathematics and Technology

#### Accepted

In [48]:
DTU_MT_P_F = np.asarray([32,24,31,30,24,23,39,21,34,36,38,25,37,25,30])
DTU_MT_P_M = np.asarray([68,76,69,70,76,77,61,79,66,64,62,75,63,75,70])

# All accepted, total
MT_YEARLY_ACCEPTED = np.asarray([34,41,49,53,62,61,64,61,62,61,65,69,67,72,86])


# Number of people
no_females_mt = np.round(np.multiply(DTU_MT_P_F, np.asarray(MT_YEARLY_ACCEPTED / 100.0)))
no_males_mt   = np.round(np.multiply(DTU_MT_P_M, np.asarray(MT_YEARLY_ACCEPTED / 100.0)))

#### Graduated

In [49]:
GRAD_MT_F = np.asarray([0,0,0,0,0,0,0,0,0,18,10,20,13,15,9])
GRAD_MT_M = np.asarray([0,0,0,0,0,0,0,0,0,31,27,31,29,35,29])

## Functions

<p style="font-size: 20px; line-height: 26pt">In order to make the plotting look a little more sleek (and minimize redundancy of code), I wrapped the plotting up in a function for bar charts with and without grouping. </p>

In [50]:
# Plotting bar chart with two groups
def plot2bar(bar1, name1, bar2, name2, x_in, title, xaxis, yaxis, filename_, rgba='rgba(255,128,0,0.9)'):
    trace1 = go.Bar(
        x    = x_in,
        y    = bar1,
        name = name1,
    )

    trace2 = go.Bar(
        x    = x_in,
        y    = bar2,
        name = name2,
        marker = dict(
            color = rgba,
        )
    )
 
    data = [trace1,trace2]
    layout= go.Layout(
        barmode='group',
        title=title,
        xaxis=dict(
            title=xaxis
        ),
        yaxis=dict(
            title=yaxis
        )
    )

    fig = go.Figure(data=data, layout=layout)
    py.iplot(fig, sharing='public', filename=filename_)

In [51]:
# Plotting bar chart - no grouping
def plotbar(bar, name1, x_in, title, xaxis, yaxis, filename_):
    trace = go.Bar(
        x    = x_in,
        y    = bar,
        name = name1,
    )
 
    data = [trace]
    layout= go.Layout(
        title = title,
        xaxis = dict(
            title = xaxis
        ),
        yaxis = dict(
            title = yaxis
        )
    )

    fig = go.Figure(data=data, layout=layout)
    py.iplot(fig, sharing='public', filename=filename_)
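Both functions above upload the figure to the plotly cloud, which requires the credentials file. As a sketch of an alternative that needs no account, the same grouped bar chart can be described as a plain dictionary in Plotly's figure format. The function name `grouped_bar_spec` is hypothetical and not part of the original notebook.

```python
def grouped_bar_spec(bar1, name1, bar2, name2, x_in, title, xaxis, yaxis):
    """Return a grouped bar chart as a plain dict in Plotly's figure format."""
    return {
        "data": [
            {"type": "bar", "x": list(x_in), "y": list(bar1), "name": name1},
            {"type": "bar", "x": list(x_in), "y": list(bar2), "name": name2},
        ],
        "layout": {
            "barmode": "group",  # bars side by side, as in plot2bar
            "title": {"text": title},
            "xaxis": {"title": {"text": xaxis}},
            "yaxis": {"title": {"text": yaxis}},
        },
    }


# Small example using the first two years of the software technology data
spec = grouped_bar_spec([5, 2], "Female", [95, 98], "Male", [2004, 2005],
                        "Accepted software technology, percentage",
                        "Year", "Percentage [%]")
print(spec["layout"]["barmode"], len(spec["data"]))
```

With plotly installed, passing `spec` to `plotly.graph_objects.Figure` should give a figure that renders locally; this has not been tested against the plotly version used in this notebook.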

## Plotting ##
<p style="font-size: 20px; line-height: 26pt">This is where the fun starts!</p>

### Acceptance

In [52]:
plot2bar(DTU_CS_P_F, "Female", 
         DTU_CS_P_M, "Male", 
         YEAR, 
         "Accepted software technology, percentage", 
         "Year", "Percentage [%]",
        'accepted_percentage.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/139/" target="_blank" title="accepted_percentage.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/139.png" alt="accepted_percentage.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:139" src="https://plot.ly/embed.js" async></script>
</div>


In [66]:
plot2bar(no_females, "Female", 
         no_males, "Male", 
         YEAR, 
         "Accepted software technology, numbers", 
         "Year", "Frequency",
        'accepted_numbers_.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/191/" target="_blank" title="accepted_numbers_.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/191.png" alt="accepted_numbers_.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:191" src="https://plot.ly/embed.js" async></script>
</div>


In [67]:
plot2bar(DTU_MT_P_F, "Female", 
         DTU_MT_P_M, "Male", 
         YEAR, 
         "Accepted mathematics and technology, percentage", 
         "Year", "Percentage [%]",
        'accepted_percentage_mt_.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/193/" target="_blank" title="accepted_percentage_mt_.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/193.png" alt="accepted_percentage_mt_.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:193" src="https://plot.ly/embed.js" async></script>
</div>


<p style="font-size: 20px; line-height: 26pt">While it is clear that the female/male ratio is still off, quite a lot more women are accepted to mathematics than to software technology.</p>

In [69]:
plot2bar(no_females_mt, "Female", 
         no_males_mt, "Male", 
         YEAR, 
         "Accepted mathematics and technology, numbers", 
         "Year", "Frequency",
        'accepted_numbers_mt_.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/195/" target="_blank" title="accepted_numbers_mt_.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/195.png" alt="accepted_numbers_mt_.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:195" src="https://plot.ly/embed.js" async></script>
</div>


----
#### Female students only

In [56]:
plot2bar(no_females, "Female, software",
         no_females_mt, "Female, mathematics",
        YEAR, "Females accepted mathematics/software", 
        "Year", "Frequency", 
        'all_female_combined.png',
        rgba='rgba(0, 204, 0, .9)',
        )

<div>
    <a href="https://plot.ly/~frksteenhoff2/161/" target="_blank" title="all_female_combined.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/161.png" alt="all_female_combined.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:161" src="https://plot.ly/embed.js" async></script>
</div>


<p style="font-size: 20px; line-height: 26pt">
Without a doubt, female students within computer science choose math!</p>
<p style="font-size: 20px; line-height: 26pt">
There has definitely been an increase in women applying for the CS undergraduate at DTU, but it only becomes substantial in 2017, when almost 20% of the accepted software students are women.
</p>

## Graduated

In [57]:
plot2bar(GRAD_CS_F, "Female", 
         GRAD_CS_M, "Male", 
         YEAR, "Graduated, Software Technology", 
         "Year", "Frequency", 
         'graduated_numbers.png',
         rgba='rgba(160,32,240,0.9)')

<div>
    <a href="https://plot.ly/~frksteenhoff2/155/" target="_blank" title="graduated_numbers.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/155.png" alt="graduated_numbers.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:155" src="https://plot.ly/embed.js" async></script>
</div>


## LOOK AT THIS!

|   | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| # women graduated with a CS degree | 0 | 5 | 0 | 0 | 2 | 1 | 0 | 2 | 0 | 2 | 8 | 6 |

> <p style="font-size: 30px; line-height: 36pt">Within the last 2 years more women have graduated with a degree in CS from DTU than in the previous 10 years combined! <span style="color: red">
*Women in software is a growing trend!*</span></p>
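The comparison between the last two years and the ten years before can be checked directly from the table. A minimal sketch, with the yearly counts copied into a dictionary named `grads` for illustration:

```python
# Women graduated with a CS degree per year, copied from the table above
grads = {
    2007: 0, 2008: 5, 2009: 0, 2010: 0, 2011: 2, 2012: 1,
    2013: 0, 2014: 2, 2015: 0, 2016: 2, 2017: 8, 2018: 6,
}

last_two = sum(grads[y] for y in (2017, 2018))
previous_ten = sum(grads[y] for y in range(2007, 2017))  # 2007..2016

print(f"2017-2018: {last_two}, 2007-2016: {previous_ten}")
```

For the numbers above this gives 14 graduates in 2017-2018 against 12 in 2007-2016.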

In [58]:
plot2bar(GRAD_MT_F, "Female", 
         GRAD_MT_M, "Male", 
         YEAR, "Graduated, Mathematics and Technology", 
         "Year", "Frequency", 
         'graduated_numbers_mt.png',
          rgba='rgba(160,32,140,0.9)')

<div>
    <a href="https://plot.ly/~frksteenhoff2/163/" target="_blank" title="graduated_numbers_mt.png" style="display: block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/163.png" alt="graduated_numbers_mt.png" style="max-width: 100%;width: 600px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:163" src="https://plot.ly/embed.js" async></script>
</div>


<p style="font-size: 20px; line-height: 26pt">Even though the maths field of study has only existed for 5 years, compared to software technology's 13, more than 3 times as many women have graduated from mathematics in those years (85) as from software technology in total (26). </p>

# Is this a common trend?
### Number of women accepted at other universities

### Women accepted at the University of Copenhagen
![acceptance_ku_2018.png](acceptance_ku_2018.png) 
<p style="font-size: 20px; line-height: 26pt">Within the last couple of years there has been an increase in the number of women choosing a CS degree.</p>

source: https://di.ku.dk/Nyheder/2018/samfundsrelevante-temaer/IT-Branchen_Flere_kvinder_p__datalogi_2018.pdf


### Women accepted at The IT University of Copenhagen
![acceptance_itu_2019.png](acceptance_itu_2019.png)
<p style="font-size: 20px; line-height: 26pt">The share of women at ITU is slowly increasing but remains at around 35%.</p>

source: https://www.itu.dk/om-itu/organisation-tal-og-fakta/tal-og-fakta/noegletal/noegletal-uddannelse

# Summary

<p style="font-size: 20px; line-height: 26pt">Overall there is quite a large number of drop-outs among the students, and the number of females who actually graduate is *very* low. Since quite a few students spend an extra semester completing their bachelor, the exact graduation date cannot be predicted from this data. Even though the number of graduating females is low, it is in fact increasing; let's hope this tendency continues.</p>

<p style="font-size: 20px; line-height: 26pt">**From 2004-2018, only 26 women have actually graduated with a degree in Computer Science, which means that on average 1.85 women graduate each year.** However, the data show that the number of women applying for an education within CS is growing.</p>

<p style="font-size: 20px; line-height: 26pt">Of all the women accepted up until 2016, only half of them have graduated.</p>
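The share of accepted women who have graduated can be estimated from the software technology arrays used earlier. This is a minimal sketch; the variable names are introduced here for illustration, and the acceptance counts are the values reconstructed from rounded percentages, so the result is approximate.

```python
import numpy as np

years = np.arange(2004, 2019)

# Female acceptance counts for software technology, reconstructed as before
pct_f = np.asarray([5, 2, 2, 0, 10, 5, 9, 7, 7, 8, 12, 7, 10, 19, 17])
total = np.asarray([56, 66, 62, 63, 62, 55, 54, 56, 55, 61, 68, 68, 63, 80, 88])
accepted_f = np.round(pct_f * total / 100.0).astype(int)

# Same numbers as GRAD_CS_F
graduated_f = np.asarray([0, 0, 0, 0, 5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6])

# Women accepted up to and including 2016 versus all women graduated by 2018
accepted_until_2016 = int(accepted_f[years <= 2016].sum())
graduated_total = int(graduated_f.sum())
share = graduated_total / accepted_until_2016

print(f"{graduated_total} of {accepted_until_2016} graduated ({share:.0%})")
```

With these numbers, 26 of roughly 51 accepted women have graduated, which is about half.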