# Key numbers on acceptance @ The Technical University of Denmark (DTU)
Students accepted into the undergraduate programs *Software Technology*, *Mathematics and Technology* and *Artificial Intelligence and Data Science*, all of which are categorized as fields of study within computer science.

Analysis created by Henriette Steenhoff, s134869

----

![start_screen.png](start_screen_2019.PNG)

See how these statistics looked in previous years (tracking started in 2018):

| Category | 2018 | 2019 | 2020 |
|---|---|---|---|
| Women in CS @ DTU       | 31   | 37   | - |
| Women graduating yearly | 1.5  | 1.85 | - |
| Women graduated in all  | 20   | 26   | - |


Previous images:
* Link to 2018 image: [2018](start_screen_2018.PNG)

----
# Basics

All work here is based on data from the [DTU Study Data Warehouse](http://dtu-studiedatavarehus.ait.dtu.dk/Default.aspx) -- sadly this site is only available in Danish.

The fun starts in the [Plotting](#Plotting) section -- fast forward to this point if you are less interested in all the nitty gritty details and coding.

----

# Prerequisites

In [1]:
# IMPORTS
import re
from urllib.request import urlopen
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

# Own helper functions
from plotbar import plot2bar, plot3bar

# QUERY FOR HISTORIC DATA ON ACCEPTANCE, SOFTWARE TECHNOLOGY
request_url = 'http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0'

response = urlopen(request_url)
#temp_data = json.load(response)

*I was hoping to fetch all the data from the webpage with urllib to make data collection easier. Because of the way the website is set up, this is more work than I had time for.*
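For anyone who wants to pick this up, a minimal sketch of a regex-free approach follows. It assumes the key numbers are rendered as ordinary `<table>` rows, which I have not verified against the real page; the `TableRows` class and the sample HTML are made up for illustration, and nothing is fetched over the network.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text fragments of the cell being built

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up markup standing in for the real page (assumption: plain table rows).
sample_html = """
<table>
  <tr><th>Year</th><th>Women</th><th>Men</th></tr>
  <tr><td>2017</td><td>19 %</td><td>81 %</td></tr>
  <tr><td>2018</td><td>17 %</td><td>83 %</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
rows = parser.rows
print(rows)
```

With the real page, the decoded result of `response.read()` would take the place of `sample_html`, provided the markup is as simple as assumed here.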

In [None]:
# Fetching data directly from the webpage works poorly: urlopen returns a raw HTTP response (bytes), not structured data.
# The information is there and can be extracted, but it takes one hell of a regular expression.
response.read(100)

----
# Key numbers
If you want to have a look at the numbers yourself directly in the database, follow the links below.

**Before you get started:**

At DTU, a bachelor's degree takes 3 years. Ideally, the number of students accepted in year `x` would equal the number of students graduating in year `x+3`. This is of course the best-case scenario; in practice there are drop-outs and changes in field of study. There are no graduates from 2004-2006 because the field of study was first offered in 2004, which means that the first students in software technology graduated in 2007.
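The `x` versus `x+3` comparison can be sketched in a few lines. The figures are the Software Technology numbers entered manually further down in this notebook; the variable names are only for this illustration. Note that the graduates of a given year are not necessarily all from the intake three years earlier, so the ratio is a rough indicator and not a true completion rate.

```python
import numpy as np

YEARS = list(range(2004, 2019))
BACHELOR_YEARS = 3  # nominal length of a DTU bachelor

# Software Technology: total accepted and total graduated (female + male) per year
accepted = np.asarray([56, 66, 62, 63, 62, 55, 54, 56, 55, 61, 68, 68, 63, 80, 88])
graduated = (np.asarray([0, 0, 0, 0, 5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6])
             + np.asarray([0, 0, 0, 9, 40, 39, 42, 33, 39, 41, 42, 47, 48, 55, 50]))

# Pair intake year x with graduation year x + 3
intake = accepted[:-BACHELOR_YEARS]
finished = graduated[BACHELOR_YEARS:]
ratio = finished / intake

for year, r in zip(YEARS, ratio):
    print(f"accepted {year} -> graduated {year + BACHELOR_YEARS}: {r:.2f}")
```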

In all URLs below I am fetching numbers on students accepted and graduated from 2004 to the present year for the civil engineering bachelor programs in Computer Science, Mathematics and Technology, and *Artificial Intelligence and Data Science (new in 2018)* at DTU.

&nbsp;
## Software Technology
As fetching the URL content directly from the website works poorly (I would have hoped for an API/open data solution, but nope), I have added the numbers from the webpage manually, using the queries below:

**Students accepted**

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0)

**Students graduated:**

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=67&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=67&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0)

&nbsp;

## Mathematics and Technology
Sadly for MT, graduation data are only available from 2013 onward.

**Students accepted**: 

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=0&udd=1&ret=30&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx?aar=0&sem=0&udd=1&ret=30&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0)

**Students graduated**:

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=18&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_faerdige.aspx?aar=0&ret=18&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=&stud=0)
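The accepted-student queries for Software Technology and Mathematics and Technology differ in only a couple of query parameters. A small sketch with `urllib.parse` makes the difference visible. What each parameter means on the server side is not documented here, so reading `ret` as the program selector is my assumption; `query_params` is a helper made up for this example.

```python
from urllib.parse import parse_qsl, urlsplit

BASE = "http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_optag.aspx"
software_url = BASE + "?aar=0&sem=e&udd=1&ret=12&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0"
maths_url = BASE + "?aar=0&sem=0&udd=1&ret=30&kon=Alle&alder=0&nt=0&vd=0&land=0&region=0&kv=0&eks=0"


def query_params(url):
    """Return the query string of `url` as a dict, keeping empty values."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


software = query_params(software_url)
maths = query_params(maths_url)

# Keep only the parameters whose values differ between the two queries
differences = {
    key: (software.get(key), maths.get(key))
    for key in software
    if software.get(key) != maths.get(key)
}
print(differences)  # {'sem': ('e', '0'), 'ret': ('12', '30')}
```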

&nbsp;

## Artificial Intelligence and Data Science
Currently only accepted students are available in the data warehouse.

**Students accepted**:

* [`http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_bestand.aspx?aar=2018&ret=32327&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=`](http://dtu-studiedatavarehus.ait.dtu.dk/vis_noegletal_bestand.aspx?aar=2018&ret=32327&udd=1&kon=Alle&alder=0&nt=0&vd=0&land=0&region=&kv=0&eks=)



----
## From number to graphs with code 
I will comment on the numbers in the plotting section. The code below could have been written more concisely, but for legibility it has been split into three parts.
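For reference, the percentage-to-headcount step that each of the three parts repeats could be wrapped in one small function. This is only a sketch of the more concise variant; `counts_from_percentages` is a name made up here, and the sample data are the Software Technology figures from the cell below.

```python
import numpy as np


def counts_from_percentages(pct_female, pct_male, total):
    """Convert yearly percentages and yearly totals into rounded headcounts."""
    pct_female = np.asarray(pct_female, dtype=float)
    pct_male = np.asarray(pct_male, dtype=float)
    total = np.asarray(total, dtype=float)
    females = np.round(pct_female * total / 100.0)
    males = np.round(pct_male * total / 100.0)
    return females, males


# Software Technology, 2004-2018
pct_f = [5, 2, 2, 0, 10, 5, 9, 7, 7, 8, 12, 7, 10, 19, 17]
pct_m = [95, 98, 98, 100, 90, 95, 91, 93, 93, 92, 88, 93, 90, 81, 83]
total = [56, 66, 62, 63, 62, 55, 54, 56, 55, 61, 68, 68, 63, 80, 88]

females, males = counts_from_percentages(pct_f, pct_m, total)
print("no female: ", females)
print("no male:   ", males)

# Sanity check: the rounded headcounts add up to the yearly totals for this data
print("sums match:", bool(np.all(females + males == np.asarray(total))))
```

Because the percentages in the data warehouse are already rounded, the headcounts are estimates; the sanity check only shows that they are consistent with the totals for this particular data set.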

### Software Technology
#### Accepted

In [2]:
# Percentage male/female accepted on undergrad Software technology at DTU

YEAR         = list(range(2004,2019))

# Percentage accepted female/male
DTU_CS_P_F   = np.asarray([5, 2, 2, 0,  10,5, 9, 7, 7, 8, 12,7, 10,19,17])
DTU_CS_P_M   = np.asarray([95,98,98,100,90,95,91,93,93,92,88,93,90,81,83])

# All accepted, total
CS_YEARLY_ACCEPTED = np.asarray([56,66,62,63,62,55,54,56,55,61,68,68,63,80,88])

# Number of people accepted (calculated from the above)
no_females = np.round(np.multiply(DTU_CS_P_F, np.asarray(CS_YEARLY_ACCEPTED / 100.0)))
no_males   = np.round(np.multiply(DTU_CS_P_M, np.asarray(CS_YEARLY_ACCEPTED / 100.0)))

print("accepted in numbers")
print("no female: ", no_females)
print("no male:   ", no_males)

accepted in numbers
no female:  [  3.   1.   1.   0.   6.   3.   5.   4.   4.   5.   8.   5.   6.  15.  15.]
no male:    [ 53.  65.  61.  63.  56.  52.  49.  52.  51.  56.  60.  63.  57.  65.  73.]


#### Graduated

In [3]:
GRAD_CS_F = np.asarray([0,0,0,0,5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6])
GRAD_CS_M = np.asarray([0,0,0,9,40,39,42,33,39,41,42,47,48,55,50])

### Mathematics and Technology

#### Accepted

In [4]:
DTU_MT_P_F = np.asarray([32,24,31,30,24,23,39,21,34,36,38,25,37,25,30])
DTU_MT_P_M = np.asarray([68,76,69,70,76,77,61,79,66,64,62,75,63,75,70])

# All accepted, total
MT_YEARLY_ACCEPTED = np.asarray([34,41,49,53,62,61,64,61,62,61,65,69,67,72,86])


# Number of people
no_females_mt = np.round(np.multiply(DTU_MT_P_F, np.asarray(MT_YEARLY_ACCEPTED / 100.0)))
no_males_mt   = np.round(np.multiply(DTU_MT_P_M, np.asarray(MT_YEARLY_ACCEPTED / 100.0)))

print("accepted in numbers")
print("no female: ", no_females_mt)
print("no male:   ", no_males_mt)

accepted in numbers
no female:  [ 11.  10.  15.  16.  15.  14.  25.  13.  21.  22.  25.  17.  25.  18.  26.]
no male:    [ 23.  31.  34.  37.  47.  47.  39.  48.  41.  39.  40.  52.  42.  54.  60.]


#### Graduated

In [5]:
GRAD_MT_F = np.asarray([0,0,0,0,0,0,0,0,0,18,10,20,13,15,9])
GRAD_MT_M = np.asarray([0,0,0,0,0,0,0,0,0,31,27,31,29,35,29])

### Artificial Intelligence and Data Science

#### Accepted

In [6]:
DTU_AI_P_F = np.asarray([0,0,0,0,0,0,0,0,0,0,0,0,0,0,19])
DTU_AI_P_M = np.asarray([0,0,0,0,0,0,0,0,0,0,0,0,0,0,81])

# All accepted, total
# Separate variable for AI so the Mathematics and Technology totals are not overwritten
AI_YEARLY_ACCEPTED = np.asarray([0,0,0,0,0,0,0,0,0,0,0,0,0,0,42])


# Number of people
no_females_ai = np.round(np.multiply(DTU_AI_P_F, np.asarray(AI_YEARLY_ACCEPTED / 100.0)))
no_males_ai   = np.round(np.multiply(DTU_AI_P_M, np.asarray(AI_YEARLY_ACCEPTED / 100.0)))

print("accepted in numbers")
print("no female: ", no_females_ai)
print("no male:   ", no_males_ai)

accepted in numbers
no female:  [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  8.]
no male:    [  0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.  34.]


#### Graduated
*Will not be available before 2021/2022*

----
## Plotting ##
This is where the fun starts! For each of the plots I will give some comments on what I read from the data and in the end I will do a recap on what was found.<p>
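The `plot2bar` and `plot3bar` helpers come from my own `plotbar` module, which is not included here. For readers who want to reproduce the charts without it, here is a rough matplotlib stand-in for a two-series grouped bar chart. `grouped_bar_sketch` is a hypothetical function, not the real helper, and it does not reproduce the styling of the embedded plots.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np


def grouped_bar_sketch(first, first_label, second, second_label,
                       years, title, xlabel, ylabel, filename=None):
    """Hypothetical stand-in for plot2bar: two bars per year, side by side."""
    x = np.arange(len(years))
    width = 0.4
    fig, ax = plt.subplots(figsize=(9, 4))
    ax.bar(x - width / 2, first, width, label=first_label)
    ax.bar(x + width / 2, second, width, label=second_label)
    ax.set_xticks(x)
    ax.set_xticklabels(years, rotation=45)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()
    if filename is not None:
        fig.savefig(filename, bbox_inches="tight")
    return ax


years = list(range(2004, 2019))
pct_f = [5, 2, 2, 0, 10, 5, 9, 7, 7, 8, 12, 7, 10, 19, 17]
pct_m = [95, 98, 98, 100, 90, 95, 91, 93, 93, 92, 88, 93, 90, 81, 83]

ax = grouped_bar_sketch(pct_f, "Female", pct_m, "Male", years,
                        "Accepted software technology, percentage",
                        "Year", "Percentage [%]")
```

Passing a `filename` saves the figure to disk, mirroring the last positional argument in the `plot2bar` calls below.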

### Acceptance - Software Technology

In [7]:
# Percentage
plot2bar(DTU_CS_P_F, "Female", 
         DTU_CS_P_M, "Male", 
         YEAR, 
         "Accepted software technology, percentage", 
         "Year", "Percentage [%]",
        'accepted_percentage.png')

# Frequency
plot2bar(no_females, "Female", 
         no_males, "Male", 
         YEAR, 
         "Accepted software technology, numbers", 
         "Year", "Frequency",
        'accepted_numbers_.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/139/" target="_blank" title="accepted_percentage.png" style="display: inline-block; text-align: right;"><img src="https://plot.ly/~frksteenhoff2/139.png" alt="accepted_percentage.png" style="max-width: 100%;width: 450px;"  width="500" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:139" src="https://plot.ly/embed.js" async></script>
    <a href="https://plot.ly/~frksteenhoff2/191/" target="_blank" title="accepted_numbers_.png" style="display: inline-block; text-align: left;"><img src="https://plot.ly/~frksteenhoff2/191.png" alt="accepted_numbers_.png" style="max-width: 100%;width: 450px;"  width="500" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:191" src="https://plot.ly/embed.js" async></script>
</div>


It is clear that the number of men accepted into software technology at DTU far exceeds the number of women. Even though the number of women is low, there is a slight increase.<p>

In general, the number of students accepted into software technology at DTU has been increasing since 2010. The number of women accepted was low from 2004 to 2016, but 2017 shows a marked increase, and the number stayed at that level the following year.<p>


### Mathematics and Technology

In [None]:
# Percentage
plot2bar(DTU_MT_P_F, "Female", 
         DTU_MT_P_M, "Male", 
         YEAR, 
         "Accepted mathematics and technology, percentage", 
         "Year", "Percentage [%]",
        'accepted_percentage_mt_.png')

# Frequency
plot2bar(no_females_mt, "Female", 
         no_males_mt, "Male", 
         YEAR, 
         "Accepted mathematics and technology, numbers", 
         "Year", "Frequency",
        'accepted_numbers_mt_.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/193/" target="_blank" title="accepted_percentage_mt_.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/193.png" alt="accepted_percentage_mt_.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:193" src="https://plot.ly/embed.js" async></script>

    <a href="https://plot.ly/~frksteenhoff2/195/" target="_blank" title="accepted_numbers_mt_.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/195.png" alt="accepted_numbers_mt_.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:195" src="https://plot.ly/embed.js" async></script>

</div>

While it is clear that the female/male ratio is still off, quite a lot more women are accepted into mathematics than into software technology.<p>

The number of students accepted into mathematics and technology seems to have been growing steadily from its starting point in 2004. With the varying total number accepted, there does not seem to be any strong correlation between the number of women and men accepted.<p>
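One way to put a number on the relationship between women and men accepted is the Pearson correlation coefficient. The sketch below uses the headcounts printed in the Mathematics and Technology cell above. A value near 0 would indicate no linear relationship, while a clearly positive value would suggest that both counts simply grow with the program as a whole.

```python
import numpy as np

# Mathematics and Technology headcounts, 2004-2018 (from the output above)
women = np.asarray([11, 10, 15, 16, 15, 14, 25, 13, 21, 22, 25, 17, 25, 18, 26])
men = np.asarray([23, 31, 34, 37, 47, 47, 39, 48, 41, 39, 40, 52, 42, 54, 60])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(women, men)[0, 1]
print(f"Pearson r (women vs. men accepted): {r:.2f}")
```

With only 15 data points the coefficient is a rough indication at best, so it should be read alongside the plots and not instead of them.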


### Artificial Intelligence and Data Science

In [None]:
# Percentage 
plot2bar(DTU_AI_P_F, "Female", 
         DTU_AI_P_M, "Male", 
         YEAR, 
         "Accepted Artificial Intelligence and Data Science, percentage", 
         "Year", "Percentage [%]",
        'accepted_percentage_aip.png')

# Frequency
plot2bar(no_females_ai, "Female", 
         no_males_ai, "Male", 
         YEAR, 
         "Accepted Artificial Intelligence and Data Science, numbers", 
         "Year", "Frequency",
        'accepted_numbers_ai.png')

<div>
    <a href="https://plot.ly/~frksteenhoff2/211/" target="_blank" title="accepted_percentage_aip.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/211.png" alt="accepted_percentage_aip.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:211" src="https://plot.ly/embed.js" async></script>
    <a href="https://plot.ly/~frksteenhoff2/201/" target="_blank" title="accepted_numbers_ai.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/201.png" alt="accepted_numbers_ai.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:201" src="https://plot.ly/embed.js" async></script>
</div>



Already in the first year, 19% of the students accepted into Artificial Intelligence and Data Science were female! More than 400 people applied but only 42 were accepted.

----
### Female students only

In [None]:
# Percentage
plot3bar(DTU_CS_P_F, "Female, software",
         DTU_MT_P_F, "Female, mathematics",
         DTU_AI_P_F, "Female, AI",
        YEAR, "Percentage of women accepted mathematics/software/AI", 
        "Year", "Percentage [%]", 
        'all_female_combined3p.png',
        rgba='rgba(0, 204, 0, .9)',
        )
# Frequency
plot3bar(no_females, "Female, software",
         no_females_mt, "Female, mathematics",
         no_females_ai, "Female, AI",
        YEAR, "Frequency of women accepted mathematics/software/AI", 
        "Year", "Frequency", 
        'all_female_combined3f.png',
        rgba='rgba(0, 204, 0, .9)',
        )

<div>
    <a href="https://plot.ly/~frksteenhoff2/207/" target="_blank" title="all_female_combined3p.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/207.png" alt="all_female_combined3p.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:207" src="https://plot.ly/embed.js" async></script>
    <a href="https://plot.ly/~frksteenhoff2/209/" target="_blank" title="all_female_combined3f.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/209.png" alt="all_female_combined3f.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:209" src="https://plot.ly/embed.js" async></script>
</div>


Aand some quick numbers

In [None]:
print("# software female graduates:    ", sum(GRAD_CS_F))  
print("# mathematics female graduates: ", sum(GRAD_MT_F))
print("ratio:                          ", round(sum(GRAD_MT_F)/sum(GRAD_CS_F),2))


Without a doubt, female students within computer science most often choose the math direction.<p>
Looking at the percentages, AI is already where it took software 14 years to end up.

There has definitely been an increase in women entering the software undergraduate program, but only significantly in recent years (2017 and 2018), where 15 of the accepted software students each year were women (19% and 17%, respectively). As can be seen in recent years, the number of women accepted in software is slowly closing in on the numbers within mathematics.
<p>

## Graduated

In [None]:
# Software
plot2bar(GRAD_CS_F, "Female", 
         GRAD_CS_M, "Male", 
         YEAR, "Graduated, Software Technology", 
         "Year", "Frequency", 
         'graduated_numbers.png',
         rgba='rgba(160,32,240,0.9)')

# Maths
plot2bar(GRAD_MT_F, "Female", 
         GRAD_MT_M, "Male", 
         YEAR, "Graduated, Mathematics and Technology", 
         "Year", "Frequency", 
         'graduated_numbers_mt.png',
          rgba='rgba(160,32,140,0.9)')

<div>
    <a href="https://plot.ly/~frksteenhoff2/155/" target="_blank" title="graduated_numbers.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/155.png" alt="graduated_numbers.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:155" src="https://plot.ly/embed.js" async></script>
    <a href="https://plot.ly/~frksteenhoff2/163/" target="_blank" title="graduated_numbers_mt.png" style="display: inline-block; text-align: center;"><img src="https://plot.ly/~frksteenhoff2/163.png" alt="graduated_numbers_mt.png" style="max-width: 100%;width: 450px;"  width="600" onerror="this.onerror=null;this.src='https://plot.ly/404.png';" /></a>
    <script data-plotly="frksteenhoff2:163" src="https://plot.ly/embed.js" async></script>
</div>


I know, overall the number of female graduates from software is bleak, but.. 
## LOOK AT THIS!

|   | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| # women graduated with a CS degree | 0 | 5 | 0 | 0 | 2 | 1 | 0 | 2 | 0 | 2 | 8 | 6 |

> Within the last 2 years more women have graduated with a degree in CS from DTU than the sum of all the women that graduated the previous 10 years! <span style="color: red">
*Women in software is a growing trend.*</span><p>

The numbers on graduated students for mathematics and technology can only be traced back to 2013. Even so, more than 3 times as many women have graduated from mathematics in those years as from software technology in all years combined.
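Both claims in this section can be checked directly from the graduation arrays entered earlier. The sketch repeats the two arrays so it runs on its own; the other variable names are only for this illustration.

```python
# Female graduates per year, 2004-2018 (same values as GRAD_CS_F and GRAD_MT_F)
GRAD_CS_F = [0, 0, 0, 0, 5, 0, 0, 2, 1, 0, 2, 0, 2, 8, 6]
GRAD_MT_F = [0, 0, 0, 0, 0, 0, 0, 0, 0, 18, 10, 20, 13, 15, 9]

# Claim 1: the last two years (2017-2018) versus the ten years before (2007-2016)
last_two = sum(GRAD_CS_F[-2:])
previous_ten = sum(GRAD_CS_F[-12:-2])
print("software, 2017-2018:", last_two)       # 14
print("software, 2007-2016:", previous_ten)   # 12

# Claim 2: mathematics since 2013 versus software in all years
maths_total = sum(GRAD_MT_F)
software_total = sum(GRAD_CS_F)
ratio = round(maths_total / software_total, 2)
print("mathematics total:", maths_total)      # 85
print("software total:   ", software_total)   # 26
print("ratio:            ", ratio)            # 3.27
```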

# Is this a common trend?
### Women accepted at the University of Copenhagen
![acceptance_ku_2018.png](acceptance_ku_2018.png) 
Within the last couple of years there has been an increase in the number of women choosing a CS degree.

source: https://di.ku.dk/Nyheder/2018/samfundsrelevante-temaer/IT-Branchen_Flere_kvinder_p__datalogi_2018.pdf


### Women accepted at The IT University of Copenhagen
![acceptance_itu_2019.png](acceptance_itu_2019.png)
The share of women at ITU is slowly increasing but lies steadily around 35%, which is generally a lot higher than the share accepted at DTU.

source: https://www.itu.dk/om-itu/organisation-tal-og-fakta/tal-og-fakta/noegletal/noegletal-uddannelse