# Module Information Scraper
This code is to scrape assessment details from UCD module-by-module. From there, we can find out how vulnerable UCD is to ChatGPT and other similar AI helpers. First we will need to import some packages to do this.

## Imports and Global Variables

In [1]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
import re
import pathlib as Path
import html5lib
import json

Next we will need to set the path to the datasets that we will use. This currently pulls in a specific file, that of MODULES.csv, which has all collected module information for the school of Engineering and Architecture. However, this could easily be changed to analyze sub-schools or other schools.

In [2]:
#This is the directory that holds all our input datasets
dir_raw=Path.Path("Datasets")

#Read in the csv that has all of our modules, if desired
moduleCodes= dir_raw / "MODULES.csv"
modules=pd.read_csv(moduleCodes)

#Print the csv with all our modules
modules

Unnamed: 0,Code,Module,School
0,DSCY10060,"Energy, Climate Change & Policy",EEE
1,EEEN10010,Electronic and Electrical Engineering I,EEE
2,EEEN10020,Robotics Design Project,EEE
3,EEEN20010,Computer Engineering,EEE
4,EEEN20020,Electrical and Electronic Circuits,EEE
...,...,...,...
519,MEEN50050,Creative Thinking & Innovation,MME
520,MEEN50060,Research Techniques Space Eng,MME
521,MEEN50070,Industrial Research I,MME
522,MEEN50080,Industrial Research II,MME


The below function will be used to read in modules. This can be by school if required, and otherwise includes all modules.

In [3]:
def input_Modules(school= None, filename=None):
    #This is the directory that holds all our input datasets
    dir_raw=Path.Path("Datasets")
    
    #This is the dict that holds the filename for each school
    school_filenames={"Civil Engineering":"MODULES_CE.csv", \
                     "Mechanical & Materials Eng": "MODULES_MME.csv", \
                     "Chem & Bioprocess Engineering": "MODULES_CBE.csv", \
                     "Biosystems & Food Engineering": "MODULES_BFE.csv", \
                     "Architecture, Plan & Env Pol": "MODULES_APEP.csv", \
                     "Electrical & Electronic Eng": "MODULES_EEE.csv"}
    
    #If the school is not equal to none, do only modules from the set school
    if school != None:
        #Get the file for the school's modules
        moduleCodes = dir_raw / school_filenames[school]
        
        #Read in the desired module codes into a dataframe
        modules=pd.read_csv(moduleCodes)
        
        print(modules)
        #Return a list of the module codes
        return modules["Unnamed: 0"].iloc, None
        
    elif filename != None:
        #If the file is an excel sheet, check it out
        if filename.endswith("xlsx"):
            corelist, excelTable=excelListReader(filename)
            
            return corelist, excelTable
        else:
            print("FILENAME ERROR: check filename, make sure its in excel format")
            return None, None
    else:
        #Set the code to look at the csv that has all of our modules, if desired
        moduleCodes= dir_raw / "MODULES.csv"
        
        #Read in the desired module codes into a dataframe
        modules=pd.read_csv(moduleCodes)

        #Return a list of the module codes
        return modules["Code"].iloc, None

In [4]:
def input_Infohub(school=None):
    #This is the directory that holds all our input datasets
    dir_raw=Path.Path("Datasets")
    
    #This is a list of schools
    schools=["Civil Engineering", \
                     "Mechanical & Materials Eng", \
                     "Chem & Bioprocess Engineering", \
                     "Biosystems & Food Engineering", \
                     "Architecture, Plan & Env Pol", \
                     "Electrical & Electronic Eng"]
    #This is the dict that holds the filename for each school
    school_filenames={"Civil Engineering":"SCE Module places by Term 22 23.csv", \
                     "Mechanical & Materials Eng": "SMME Module places by Term 22 23.csv", \
                     "Chem & Bioprocess Engineering": "SCBE Module places by Term 22 23.csv", \
                     "Biosystems & Food Engineering": "SBFE Module places by Term 22 23.csv", \
                     "Architecture, Plan & Env Pol": "SAPEP Module places by Term 22 23.csv", \
                     "Electrical & Electronic Eng": "SEEE Module places by Term 22 23.csv"}
    
    #If the school is not set, loop through all of them
    if school == None:
        schoolHubs=[]
        for school in schools:
            print("Getting file on the School of %s at %s"  %(school, school_filenames[school]))
            
            #Find the file location
            dir_in= dir_raw / school_filenames[school]
            schoolHubs.append(pd.read_csv(dir_in))
            
    
        #Combine all the module details together
        infohub=pd.concat(schoolHubs)
    #Otherwise just get that one school
    else:
        dir_in= dir_raw / school_filenames[school]
        infohub=pd.read_csv(dir_in)
        
    #Get rid of any modules that end in a letter
    infohub["Module"]=infohub["Module"].apply(lambda x : None if(re.search(r'\d+$', x) == None) else x)
    infohub.dropna(subset=["Module"], inplace=True)
    
    return infohub

In [5]:
input_Infohub()

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv


Unnamed: 0,Module,Module Title,Semester,Module Coordinator,Total Places - Max,Total Places - Avail,Core - Max,Core - Avail,Gen Elective - Max,Gen Elective - Avail,In Program - Max,In Program - Avail,International - Max,International - Avail,1st Year Elective - Max,1st Year Elective - Avail,Open Learning - Max,Open Learning - Avail,ALERT
0,CVEN10040,Creativity in Design,Autumn,sarah.cotterill@ucd.ie,353,53,348,52,0,0,0,0,5,1,0,0,0,0,
1,CVEN10050,Intro to Civil & Envir Eng,Spring,shane.donohue@ucd.ie,80,52,40,33,5,3,0,0,0,0,35,16,0,0,
2,CVEN10060,Eng & Arch Structures 1,Spring,daniel.mccrum@ucd.ie,175,29,165,22,10,7,0,0,0,0,0,0,0,0,
3,CVEN20010,Mechanics of Solids I,Spring,abdollah.malekjafarian@ucd.ie,72,19,68,16,2,2,0,0,2,1,0,0,0,0,
6,CVEN20030,Environmental Eng Fundamentals,Autumn,sarah.cotterill@ucd.ie,70,9,70,9,0,0,0,0,0,0,0,0,0,0,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
86,EEEN50070,Industrial Research I,Autumn,paul.curran@ucd.ie,10,10,10,10,0,0,0,0,0,0,0,0,0,0,
87,EEEN50080,Industrial Research II,Autumn,paul.curran@ucd.ie,5,5,5,5,0,0,0,0,0,0,0,0,0,0,
88,EEEN50090,Industrial Research III,Autumn,paul.curran@ucd.ie,5,5,5,5,0,0,0,0,0,0,0,0,0,0,
89,EEEN50100,Nonlinear System Stability,Spring,federico.milano@ucd.ie,25,22,25,22,0,0,0,0,0,0,0,0,0,0,


The below function is used for reading in module codes that are stored in an Excel sheet. The standard paths for certain engineering qualifications is currently the main use for such a method. 

In [6]:
#This function reads in an excel sheet with module codes
def excelListReader(filename, excel_table=True):

    #Get the input file path
    coreCodes= dir_raw / filename

    #Make sure that it is in the desired excel table format
    if excel_table:
        #If it is, read in the excel sheet, and get the values in the "Module" column as a list
        coreModules=pd.read_excel(coreCodes)
        coreList=coreModules["Module"].values.tolist()
        
    else:
        #Return an error if the file is not in an excel sheet
        print("ERROR: not in excel table format")
        return None

    #print the module codes that we found, and then return them
    print(coreList)
    return coreList, coreModules

## Scraper Functions and Required Sub Functions

The module descriptor scraper pulls all module descriptor information from the UCD module website. This includes information such as who runs the module, and importantly for our analysis, the number of credits for each module.

In [7]:
#This pulls all module descriptor information from the publicly available UCD module website
def module_descriptor_scraper(url, level=None, school=None):

    #Get the HTML representation of the module page, the page being given by the URL
    request=requests.get(url)
    soup=BeautifulSoup(request.content, 'html.parser')

    #This will hold all items in the description list and associate them with their related element
    descriptor_list={}

    #Get all the elements in the "description list" - the 'dl'
    for element in soup.select('dl'):
        #Get the element text 
        credit_list=element.text
        
        #Taking the "Description Term", dt, and the "Description element", dd, as a pair
        for items in zip(soup.select('dt'), soup.select('dd')):
            #Create a dictionary item with the term and its associated element, to be turned into a series later
            descriptor_list[items[0].text]=items[1].text

    
    #Create a Series from the items in the description list
    module_descriptor=pd.Series(descriptor_list)
    #Make sure that the Credits column is numeric - if there is an error when changing to numeric, the value is set as None
    module_descriptor["Credits:"]=pd.to_numeric(module_descriptor["Credits:"], errors='coerce')
    
    #This implements filters as desired. If filtered changes to true, it means that the item is filtered out
    filtered=False
    online=False
    
    #If filters exist, check that the module is not filtered out
    if (level != None):
        filtered= (pd.to_numeric(module_descriptor["Level:"].split('(')[0], errors='ignore') != level)
                   
    #If it wasn't filtered out by level, check if it is filtered out by school
    if (filtered == False) and (school != None):
        filtered = (module_descriptor["School:"] != school)

    #Check if the module is delivered online or not
    if(module_descriptor["Mode of Delivery:"] == "Online"):
        online=True
        
    #Return the module descriptor and whether or not it was filtered out
    return module_descriptor, filtered, online

The below code is used to simply assert that the filtering worked, and is more of a sanity check than anything else.

In [8]:
#This asserts that the filter works correctly
def assert_filtered(module_descriptors, level=None, school=None):
    #Combine all descriptors into a dataframe
    all_descriptors=pd.concat(module_descriptors)
    
    #Make sure that IF the level was specified, only one level is allowed
    if level !=None:
        assert (all_descriptors["Level:"].nunique() == 1)
    
    #Make sure that only one school is allowed, IF it was specified
    if school != None:
        assert (all_descriptors["School:"].nunique() == 1)
        
    #Print all the unique schools scraped
    print("\n %s" %all_descriptors["School:"].unique())
    
    #Return the number of unique values - by school and level
    return all_descriptors["School:"].nunique(), all_descriptors["Level:"].nunique()

Below is a helper function. This saves files as desired after their information has been taken from the UCD website. 

In [9]:
def save_module_files(module_assessments, module_descriptors, codeList=None, level=None, school=None, foldername=None):
    #The directory to save outputs to
    dir_output=Path.Path("ModuleInformation")
    dir_output.mkdir(parents=True, exist_ok=True)
    
    subdirectory=""
    #Save the file in its desired format
    if level != None:
        subdirectory+="Level=%d" %(level)
        
    if school != None:
        subdirectory+="_School="+school.replace(" ", "-")
    
    if codeList != None:
        subdirectory+="SelectedModules"
        
    if foldername != None:
        subdirectory=foldername
        
    #if the modules have been filtered, and thus belong in a sub directory, make that directory
    if len(subdirectory) > 0:
        dir_output=dir_output / subdirectory
        dir_output.mkdir(parents=True, exist_ok=True)
        
   
        
    #Save our two module detail files
    with open(dir_output / "assessments.json", 'w') as outfile:
        if (len(module_assessments)) > 2 and (isinstance(module_assessments, list)):
            module_assessments=pd.concat(module_assessments, ignore_index=True)
            print("saving to %s" % dir_output)
        outfile.write(module_assessments.to_json())
        
    with open(dir_output / "descriptors.json", 'w') as outfile:
        if (len(module_descriptors) > 2) and (isinstance(module_descriptors, list)):
            module_descriptors=pd.DataFrame(module_descriptors)
            print("saving to %s" % dir_output)
        outfile.write(module_descriptors.to_json())

Below is a dict that will be used to develop a custom column, "work type", which will be necessary for further analysis later.

In [10]:
work_type={"Assignment" :"At home", \
                "Attendance": "In person", \
                "Class Test" : "In person", \
                "Continuous Assessment": "At home", \
               "Essay": "At home", \
                "Examination": "In person", \
                "Fieldwork": "In person", \
                "Group Project": "Blended", \
                "Journal": "Blended",\
               "Lab Report": "Blended", \
                "Multiple Choice Questionnaire": "Blended", \
                "Oral Examination": "In person", \
               "Portfolio" : "Blended",  \
                "Practical Examination": "In person", \
                "Presentation" : "In person", \
                "Project": "At home", \
               "Seminar": "In person", \
               "Studio Examination" : "In person",\
               "Assessments worth <2%": "Unknown"}

The below code collects all module assessment and module descriptor information into two lists. It also creates a "Scaled % of Final Grade" column in the asssessment table. This weights the assessment based on the number of credits the module has overall. In this way, the median and normal amount of credits, 5.0, has assessments weighting that add up to 100%. Those above and below are given assessment weightings that scale with how much more or less they are worth then a normal module - a 10 credit module will have assessments that add up to 200%, because they are worth twice the amount as a normal module.

Error module details are stored for inspection later, to see why they occurred. The code continues on even if errors occur, after having stored these details it simply proceeds to the next module.

There are two sources of information that we are using, the module codes available from the search function, and the previous academic year module records. Experimentation proved that the previous academic year records include some modules that have  since been removed - while all the module codes currently available from the search function are included in it. Therefore, the module codes is a subset of the previous academic year records.

### The Collector Function - combines sub functions to collect all available information on desired modules.

In [11]:
#This functiom will allow school and year functions to be placed on it
def collector(codeList=None, level=None, school=None, filename=None, foldername=None):
    #This will store module information
    module_assessments=[]
    module_descriptors=[]

    #This will store error module information
    error_modules=[]
    error_module_descriptors=[]

    #Next we need to get our moduleCodes
    moduleCodes
    
    #Pick where to get the module codes from
    if codeList!=None:
        modulesCodes=codeList
    else:
        modulesCodes, excelTable=input_Modules(school=school, filename=filename)
        
    #Get the previous academic year records
    infohub=input_Infohub(school=school)
        
    #Going through the modules one-by-one    
    for i in modulesCodes:
        #Get the associated previous academic year record
        
            
        #Let the user know we iterated
        print(".",end="")
        
        #Change the URL to finish with the desired module code
        url= "https://hub.ucd.ie/usis/!W_HU_MENU.P_PUBLISH?p_tag=MODULE&MODULE=" + i
    
        #Get the module descriptor
        descriptor, filtered, online=module_descriptor_scraper(url, level=level, school=school)
        #If the module is in violation of the filters, continue to the next without saving
        if filtered==True:
            continue
            
        #Use pandas to read in the asssessment html table. This starts with the word 'Description', 
        #which is how we differentiate it from the other tables on the webpage
        table=pd.read_html(url, match="Description")
        
        #Get the first table, and turn it into a dataframe
        df=pd.DataFrame(table[0])
        #Create the "Assessment Type" column. There are 18 assessment types across UCD
        df["Assessment Type"] = df['Description'].str.split(':').str[0]
        #Add in the module code column to both the assessment dataframe and the descriptor list
        df["Module Code"]=i
        descriptor["Module Code"]=i

        #Try and create a column where the grade is scaled by credits worth, with 5 credits being the normal
        try:
            df["Scaled % of Final Grade"]= df['% of Final Grade'].apply(lambda x: x * (descriptor["Credits:"]/5.0))
            
        #If the scaling didn't work, this is an error module. Save it as an error module and continue
        except:
            print("\nERROR MODULE DETECTED.")
            print("Module may need to be inspected, saving information as an error module and continuing without it")
        
            error_modules.append(df)
            error_module_descriptors.append(descriptor)
            continue
            
        #Add a new column, "Work Type". This is based on the assessment type, and should help inform us of how big the risk of 
        #ChatGPT in assessment is.
        #If the module is delivered online, the work type is only "At home", owing to the inherent risk to ChatGPT of these
        #modules. Otherwise, set the assessment type according to the provided dict
        if online:
            df["Work Type"]="At home"
        else:
            #Replace the (short) MCQ with just MCQ for simplicity
            df=df.replace("Multiple Choice Questionnaire (Short)", "Multiple Choice Questionnaire")
            df["Work Type"]=df["Assessment Type"].apply(lambda x: work_type[x])
        
        #Add a few extra columns onto the dataFrame, so that we could make an interactive graph later
        df["Level"]=descriptor["Level:"]
        df["Credits"]=descriptor["Credits:"]
        df["School"]=descriptor["School:"]
        df["Module Coordinator"]=descriptor["Module Coordinator:"]
        
        #Add the records from the previous academic year
        prev_record=infohub[infohub["Module"] == i]
        #If the records from the previous year exist
        if not prev_record.empty:
            df["Semester"]=prev_record["Semester"].iloc[0]
            df["Enrolled Students 22/23"]=prev_record["Total Places - Max"].iloc[0]\
            -prev_record["Total Places - Avail"].iloc[0]
            df["Module Title"]= prev_record["Module Title"].iloc[0]
        else:
            df["Semester"]=None
            df["Module Title"]=None
            df["Enrolled Students 22/23"]=None
        
        #Add the stage if we know it
        if filename == None:
            df["Stage"]=None
        else:
            df["Stage"]=excelTable[excelTable["Module"] == i]["Stage"].iloc[0]
            #df["Credits"]=excelTable[excelTable["Module"] == i]["Credits"].iloc[0]
            
            
        #Append the module information dataframes to their respective lists
        module_assessments.append(df)
        module_descriptors.append(descriptor)
        
        #Save the individual module files
        save_module_files(df, descriptor, foldername="IndividualModules/%s" %i)
    
    #This asserts that the filters were properly imposed, if imposed at all
    num_schools, num_levels=assert_filtered(module_descriptors, level, school)

    #Inform the user that we have finished
    print("\nFINISHED, SCRAPED DETAILS ON %d MODULES, OVER %d SCHOOLS AND %d LEVELS" \
          %(len(module_assessments), num_schools, num_levels))
    
    #Save the output files
    save_module_files(module_assessments, module_descriptors, codeList=codeList, level=level, school=school, \
                      foldername=foldername)
    
    #Return the desired variables, the list of module assessment and descriptor dataframes, as well as the error dataframes
    return module_assessments, module_descriptors, error_modules, error_module_descriptors

Having defined the collector function, we simply now need to run it. The collector function is a general function. It can use filters in a number of ways as required. Filters can:
- Limit the school
- Limit the level
- Limit to a module list (mainly used for possible engineering paths)

It automatically saves the collected and scraped data in files when done. It also collects all "error modules" for later visual inspection. These are modules that have some anamoly that necessitates them not being included with the rest of the modules, and a detection message is thrown every time they are found.

Now we will run the collector function
## Collecting Data:
### Collecting Data on All Modules from the College of Engineering and Architecture

In [12]:
#Run the above function in its base form
module_assessments, module_descriptors, ALL_error_modules, ALL_error_module_descriptors=collector()

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
.......................................................................................................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
..........................................................................................................................
ERROR MODULE DETECTED.
Module may need to be inspected,

### Collecting Data by Standard Undergraduate Paths

In [13]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Electronic_UG_Modules.xlsx", foldername="ElectronicPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'COMP10060', 'EEEN10020', 'EEEN20010', 'EEEN20020', 'EEEN20050', 'EEEN20070', 'MATH20290', 'EEEN20030', 'EEEN20040', 'EEEN20060', 'EEEN20090', 'STAT20060', 'SCI20020', 'COMP20200', 'ACM30030', 'COMP20080', 'EEEN30020', 'EEEN30110', 'EEEN30030', 'EEEN30050', 'EEEN30120', 'EEEN30150', 'EEEN30190', 'EEEN30060', 'EEEN30160', 'COMP20180']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electr

In [14]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Electronic_UG_Modules.xlsx", foldername="ElectricalPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'COMP10060', 'EEEN10020', 'EEEN20010', 'EEEN20020', 'EEEN20050', 'EEEN20070', 'MATH20290', 'EEEN20030', 'EEEN20040', 'EEEN20060', 'EEEN20090', 'STAT20060', 'SCI20020', 'COMP20200', 'ACM30030', 'COMP20080', 'EEEN30020', 'EEEN30110', 'EEEN30030', 'EEEN30050', 'EEEN30120', 'EEEN30150', 'EEEN30190', 'EEEN30060', 'EEEN30160', 'COMP20180']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electr

In [15]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Architecture_UG_Modules.xlsx", foldername="ArchitecturePath")

['ARCT10010', 'ARCT10030', 'ARCT10070', 'ARCT10120', 'ARCT10020', 'ARCT10040', 'ARCT10090', 'CVEN10060', 'ARCT20020', 'ARCT20040', 'ARCT20050', 'ARCT20130', 'ARCT20010', 'ARCT20100', 'CVEN20040', 'ARCT20170', 'ARCT30010', 'ARCT30030', 'ARCT30090', 'CVEN30100', 'ARCT30040', 'ARCT30100', 'ARCT30130', 'ARCT20570']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
........................
 ['Architecture, Plan & Env Pol' 'Civil Engineering']

FINISHED, SCRAPED DETAILS

In [16]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Mechanical_UG_Modules.xlsx", foldername="MechanicalPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'COMP10060', 'DSCY10060', 'EEEN20020', 'MATH20290', 'MEEN20010', 'MEEN20020', 'MEEN20050', 'MEEN20030', 'MEEN20040', 'MEEN20060', 'MEEN20070', 'STAT20060', 'COMP20080', 'SCI20020', 'ACM30030', 'MEEN30030', 'MEEN30040', 'MEEN30090', 'MEEN30100', 'EEEN20090', 'EEEN30150', 'MEEN30010', 'MEEN30020', 'MEEN30140', 'MEEN30130', 'MEEN30160']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electr

In [17]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Biomed-Elec_UG_Modules.xlsx", foldername="BiomedElectricalPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'BMOL10030', 'BSEN10010', 'EEEN20010', 'EEEN20020', 'MATH20290', 'MEEN20010', 'PHYS20040', 'EEEN20030', 'MEEN20030', 'MEEN20040', 'MEEN20070', 'STAT20060', 'EEEN20040', 'MEEN20020', 'ACM30030', 'ANAT20090', 'EEEN30160', 'EEEN30150', 'EEEN30180', 'MEEN30160', 'EEEN30020', 'EEEN30110', 'EEEN30030', 'EEEN30050', 'EEEN30120', 'MEEN30140']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Elect

In [18]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Biomed-Mech_UG_Modules.xlsx", foldername="BiomedMechanicalPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'BMOL10030', 'BSEN10010', 'EEEN20010', 'EEEN20020', 'MATH20290', 'MEEN20010', 'PHYS20040', 'EEEN20030', 'MEEN20030', 'MEEN20040', 'MEEN20070', 'STAT20060', 'MEEN20060', 'MEEN20050', 'ACM30030', 'ANAT20090', 'EEEN30160', 'EEEN30150', 'EEEN30180', 'MEEN30160', 'MEEN20020', 'MEEN30090', 'MEEN30010', 'MEEN30020', 'MEEN30140', 'MEEN30030']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Elect

In [19]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Chemical_UG_Modules.xlsx", foldername="ChemicalPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10060', 'COMP10060', 'CHEN20020', 'CHEN20030', 'CHEN20050', 'CHEN20080', 'MATH20290', 'CHEM20060', 'CHEN20060', 'CHEN20070', 'CHEM20070', 'CHEN20090', 'CHEN10010', 'ACM30030', 'CHEN30010', 'CHEN30020', 'CHEN30030', 'CHEN30040', 'CHEN30130', 'CHEN30200', 'CHEN30210', 'CHEN40790', 'MEEN30140', 'CHEN40150', 'CHEN40160', 'CHEN40210', 'CHEN40620', 'CHEN40570']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the Sc

In [20]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Civil_UG_Modules.xlsx", foldername="CivilPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10070', 'CVEN10060', 'CVEN20030', 'CVEN20080', 'CVEN20110', 'CVEN20130', 'MATH20290', 'CVEN20010', 'CVEN20070', 'CVEN20120', 'CVEN20140', 'STAT20060', 'COMP20080', 'CVEN20040', 'ACM30030', 'CVEN30020', 'CVEN30040', 'CVEN30090', 'CVEN30390', 'CVEN30010', 'CVEN30060', 'CVEN30170', 'CVEN30400', 'GEOL30070', 'BSEN30240', 'MEEN30140']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Elect

In [21]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_CPEP_UG_Modules.xlsx", \
           foldername="CityPlanningAndEnvironmentalPolicyPath")

['AESC10010', 'ENVP10030', 'LARC10110', 'PLAN10010', 'PLAN10020', 'PLAN10030', 'ENVP10010', 'PLAN10040', 'PLAN10080', 'SSJ10060', 'GEOG10100', 'ENVP20020', 'PLAN20020', 'PLAN20070', 'PLAN20090', 'PLAN20010', 'PLAN20030', 'PLAN20040', 'PLAN20080', 'ECON20060', 'DEV20130', 'ENVP30010', 'ENVP30030', 'PLAN30010', 'PLAN30020', 'PLAN30040', 'PLAN30030', 'PLAN30060', 'PLAN30080', 'PLAN30150', 'GEOG30860', 'SPOL30220']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
...

In [22]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_LandArch_UG_Modules.xlsx", \
           foldername="LandscapeArchitecturePath")

['AESC10010', 'LARC10050', 'LARC10110', 'LARC10120', 'LARC10090', 'LARC10100', 'ENVP10010', 'ARCH10050', 'BIOL30020', 'HORT30050', 'LARC20150', 'AESC20070', 'LARC20160', 'LARC20170', 'PLAN10010', 'FOR20110', 'LARC30150', 'GEOG40770', 'PLAN30020', 'LARC30170', 'LARC30220', 'GEOG30860', 'LARC40390', 'LARC40420', 'LARC40360', 'LARC40540', 'PLAN30150', 'ARCT40660', 'LARC40550']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
.............................
 ['Agricult

In [23]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_StructEngArch_UG_Modules.xlsx", \
           foldername="StructuralEngineerWithArchitecturePath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10070', 'CVEN10060', 'ARCT20040', 'CVEN20080', 'CVEN20110', 'MATH20290', 'CVEN20010', 'CVEN20040', 'CVEN20070', 'CVEN20120', 'CVEN20140', 'STAT20060', 'CVEN20030', 'CVEN20130', 'ACM30030', 'ARCT30030', 'CVEN30020', 'CVEN30040', 'CVEN30090', 'CVEN30390', 'CVEN30010', 'CVEN30170', 'CVEN30400', 'MEEN30130', 'CVEN30060', 'MEEN30140']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Elect

### Collecting Data by Standard Integrated Masters Path

In [24]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Biomed-Elec_ME_Modules.xlsx", \
           foldername="BiomedElectronicMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'BMOL10030', 'BSEN10010', 'EEEN20010', 'EEEN20020', 'MATH20290', 'MEEN20010', 'PHYS20040', 'EEEN20030', 'MEEN20030', 'MEEN20040', 'MEEN20070', 'STAT20060', 'EEEN20040', 'MEEN20020', 'ACM30030', 'ANAT20090', 'EEEN30160', 'EEEN30150', 'EEEN30180', 'MEEN30160', 'EEEN30020', 'EEEN30110', 'EEEN30030', 'EEEN30050', 'EEEN30120', 'MEEN30140', 'EEEN40660', 'MEEN40600', 'MEEN40620', 'MEEN40630', 'EEEN40010', 'EEEN40580', 'EEEN40170', 'EEEN40220', 'MEEN40560', 'CHEN40470', 'EEEN40070', 'EEEN40350', 'COMP41670', 'COMP47460', 'EEEN40130', 'MEEN40160']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems

In [25]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Biomed-Mech_ME_Modules.xlsx", \
           foldername="BiomedMechanicalMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'BMOL10030', 'BSEN10010', 'EEEN20010', 'EEEN20020', 'MATH20290', 'MEEN20010', 'PHYS20040', 'EEEN20030', 'MEEN20030', 'MEEN20040', 'MEEN20070', 'STAT20060', 'MEEN20060', 'MEEN20050', 'ACM30030', 'ANAT20090', 'EEEN30160', 'EEEN30150', 'EEEN30180', 'MEEN30160', 'MEEN20020', 'MEEN30090', 'MEEN30010', 'MEEN30020', 'MEEN30140', 'MEEN30030', 'EEEN40660', 'MEEN40600', 'MEEN40620', 'MEEN40630', 'MEEN40020', 'EEEN40580', 'EEEN40170', 'EEEN40220', 'MEEN40560', 'CHEN40470', 'MEEN41160', 'EEEN40350', 'MEEN40030', 'MEEN40170', 'MEEN40160', 'EEEN40730']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems

In [26]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Electrical_ME_Modules.xlsx", \
           foldername="ElectricalMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'COMP10060', 'EEEN10020', 'EEEN20010', 'EEEN20020', 'EEEN20050', 'EEEN20070', 'MATH20290', 'EEEN20030', 'EEEN20040', 'EEEN20060', 'EEEN20090', 'STAT20060', 'SCI20020', 'COMP20200', 'ACM30030', 'COMP20080', 'EEEN30020', 'EEEN30110', 'EEEN30030', 'EEEN30050', 'EEEN30120', 'EEEN30150', 'EEEN30090', 'EEEN30070', 'MEEN30100', 'MEEN30140', 'EEEN40010', 'EEEN40080', 'EEEN40110', 'EEEN40550', 'MEEN40090', 'EEEN40310', 'EEEN40190', 'EEEN40260', 'EEEN40100', 'EEEN40090', 'EEEN40120', 'MEEN40430', 'EEEN40580', 'ACM40290', 'COMP47670']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Enginee

In [27]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Electronic_ME_Modules.xlsx", \
           foldername="ElectronicMastersPath")

['CHEM10030', 'CHEN10040', 'COMP10060', 'CVEN10040', 'EEEN10010', 'EEEN10020', 'MATH10250', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10150', 'PHYC10160', 'COMP20200', 'EEEN20010', 'EEEN20020', 'EEEN20030', 'EEEN20040', 'EEEN20050', 'EEEN20060', 'EEEN20070', 'EEEN20090', 'MATH20290', 'SCI20020', 'STAT20060', 'ACM30030', 'COMP20080', 'COMP20180', 'EEEN30020', 'EEEN30030', 'EEEN30050', 'EEEN30060', 'EEEN30110', 'EEEN30120', 'EEEN30150', 'EEEN30160', 'EEEN30190', 'COMP41670', 'EEEN40050', 'EEEN40060', 'EEEN40130', 'EEEN40150', 'EEEN40570', 'EEEN40210', 'EEEN40240', 'EEEN40010', 'EEEN40580', 'MEEN40430', 'EEEN40310', 'EEEN40720', 'COMP47670', 'EEEN40690']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engine

In [28]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Mechanical_ME_Modules.xlsx", \
           foldername="MechanicalMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'COMP10060', 'DSCY10060', 'EEEN20020', 'MATH20290', 'MEEN20010', 'MEEN20020', 'MEEN20050', 'MEEN20030', 'MEEN20040', 'MEEN20060', 'MEEN20070', 'STAT20060', 'COMP20080', 'SCI20020', 'ACM30030', 'MEEN30030', 'MEEN30040', 'MEEN30090', 'MEEN30100', 'EEEN20090', 'EEEN30150', 'MEEN30010', 'MEEN30020', 'MEEN30140', 'MEEN30130', 'MEEN30160', 'MEEN40010', 'MEEN40020', 'MEEN40030', 'MEEN40050', 'MEEN40060', 'MEEN40150', 'MEEN40720', 'MEEN40700', 'MEEN40560', 'MEEN40170', 'MEEN40190', 'EEEN40010', 'MEEN40430', 'MEEN40090', 'MEEN40670']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engine

In [29]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_StructEngArch_ME_Modules.xlsx", \
           foldername="StructuralEngineerWithArchitectureMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10070', 'CVEN10060', 'ARCT20040', 'CVEN20080', 'CVEN20110', 'MATH20290', 'CVEN20010', 'CVEN20040', 'CVEN20070', 'CVEN20120', 'CVEN20140', 'STAT20060', 'CVEN20030', 'CVEN20130', 'ACM30030', 'ARCT30030', 'CVEN30020', 'CVEN30040', 'CVEN30090', 'CVEN30390', 'CVEN30010', 'CVEN30170', 'CVEN30400', 'MEEN30130', 'CVEN30060', 'MEEN30140', 'ARCT40030', 'CVEN40390', 'CVEN40550', 'CVEN40610', 'CVEN40720', 'CVEN40780', 'CVEN40730', 'CVEN40750', 'CVEN40760', 'CVEN40770', 'STAT40690', 'ARCT40870', 'CVEN40050', 'CVEN40120', 'MEEN40430']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engin

In [30]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Civil_ME_Modules.xlsx", \
           foldername="CivilMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10070', 'CVEN10060', 'CVEN20030', 'CVEN20080', 'CVEN20110', 'CVEN20130', 'MATH20290', 'CVEN20010', 'CVEN20070', 'CVEN20120', 'CVEN20140', 'STAT20060', 'COMP20080', 'CVEN20040', 'ACM30030', 'CVEN30020', 'CVEN30040', 'CVEN30090', 'CVEN30390', 'CVEN30010', 'CVEN30060', 'CVEN30170', 'CVEN30400', 'GEOL30070', 'BSEN30240', 'MEEN30140', 'CVEN30110', 'CVEN40390', 'CVEN40690', 'CVEN40720', 'CVEN40780', 'CVEN40830', 'CVEN40130', 'CVEN40750', 'CVEN40760', 'STAT40690', 'CVEN40710', 'MEEN40430', 'CVEN40550', 'CHEN40010', 'CVEN40060']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engin

In [31]:
module_assessments, module_descriptors, error_modules, error_module_descriptors\
=collector(filename="UCD_EngArch_Path_Chemical_ME_Modules.xlsx", \
           foldername="ChemicalEngineeringMastersPath")

['CHEM10030', 'CHEN10040', 'CVEN10040', 'EEEN10010', 'MATH10250', 'PHYC10150', 'MATH10260', 'MEEN10030', 'MEEN10050', 'PHYC10160', 'DSCY10060', 'COMP10060', 'CHEN20020', 'CHEN20030', 'CHEN20050', 'CHEN20080', 'MATH20290', 'CHEM20060', 'CHEN20060', 'CHEN20070', 'CHEM20070', 'CHEN20090', 'CHEN10010', 'ACM30030', 'CHEN30010', 'CHEN30020', 'CHEN30030', 'CHEN30040', 'CHEN30130', 'CHEN30200', 'CHEN30210', 'CHEN40790', 'MEEN30140', 'CHEN40150', 'CHEN40160', 'CHEN40210', 'CHEN40620', 'CHEN40570', 'CHEN40700', 'CHEN40740', 'CHEN40750', 'CHEN40590', 'CHEN40010', 'CHEN40560', 'CHEN40610', 'CHEN40130', 'CHEN40440', 'CHEN40460']
Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22

### Collecting Data by Level

In [32]:
#Run the collector function to only collect level 1 modules in the college of Engineering and Architecture
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(level=1)

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
..................................................................................................................................................................................................................................................................................................................................................................................................................................

In [33]:
#Run the collector function to only collect level 1 modules in the college of Engineering and Architecture
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(level=2)

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
..................................................................................................................................................................................................................................................................................................................................................................................................................................

In [34]:
#Run the collector function to only collect level 1 modules in the college of Engineering and Architecture
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(level=3)

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
..................................................................................................................................................................................................................................................................................................................................................................................................................................

In [35]:
#Run the collector function to only collect level 1 modules in the college of Engineering and Architecture
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(level=4)

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
.......................................................................................................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
..........................................................................................................................
ERROR MODULE DETECTED.
Module may need to be inspected,

In [36]:
#Run the collector function to only collect level 1 modules in the college of Engineering and Architecture
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(level=5)

Getting file on the School of Civil Engineering at SCE Module places by Term 22 23.csv
Getting file on the School of Mechanical & Materials Eng at SMME Module places by Term 22 23.csv
Getting file on the School of Chem & Bioprocess Engineering at SCBE Module places by Term 22 23.csv
Getting file on the School of Biosystems & Food Engineering at SBFE Module places by Term 22 23.csv
Getting file on the School of Architecture, Plan & Env Pol at SAPEP Module places by Term 22 23.csv
Getting file on the School of Electrical & Electronic Eng at SEEE Module places by Term 22 23.csv
..................................................................................................................................................................................................................................................................................................................................................................................................................................

### Collecting Data by School

In [37]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Mechanical & Materials Eng")

    Unnamed: 0                               0
0    DSCY10070            Materials in Society
1    MEEN10030         Mechanics for Engineers
2    MEEN10050              Energy Engineering
3    MEEN20010           Mechanics of Fluids I
4    MEEN20020     Manufacturing Engineering I
..         ...                             ...
103  MEEN50050  Creative Thinking & Innovation
104  MEEN50060   Research Techniques Space Eng
105  MEEN50070           Industrial Research I
106  MEEN50080          Industrial Research II
107  MEEN50090         Industrial Research III

[108 rows x 2 columns]
....................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
..................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
........................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error modu

In [38]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Civil Engineering")

   Unnamed: 0                                                  0
0   CVEN10040                               Creativity in Design
1   CVEN10050                         Intro to Civil & Envir Eng
2   CVEN10060   The Engineering and Architecture of Structures 1
3   CVEN20010                                Mechanics of Solids
4   CVEN20030             Environmental Engineering Fundamentals
5   CVEN20040     The Engineering & Architecture of Structures 2
6   CVEN20070         Computer Applications in Civil Engineering
7   CVEN20080                             Construction Materials
8   CVEN20110                                      Geotechnics 1
9   CVEN20120                              Construction Practice
10  CVEN20130                                       Hydraulics I
11  CVEN20140                          Design and Communications
12  CVEN30010                                      Geotechnics 2
13  CVEN30020                           Analysis of Structures 1
14  CVEN30040            

In [39]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Chem & Bioprocess Engineering")

   Unnamed: 0                                               0
0   CHEN10010         Chemical Engineering Process Principles
1   CHEN10040                        Intro. to Eng. Computing
2   CHEN20020   Chemical & Bioprocess Engineering Measurement
3   CHEN20030  Chemical Engineering Thermodynamics & Kinetics
4   CHEN20050               Bioprocess Engineering Principles
..        ...                                             ...
67  CHEN40870                  GMP Manufact Advan Therapeutic
68  CHEN50050                      Research Seminar Series II
69  CHEN50060                   Industrial Research Placement
70  CHEN50070                  Industrial Research Internship
71  CHEN50080                       Research Seminar Series I

[72 rows x 2 columns]
..................................................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
.
ERROR MODULE DETECTED.
Module may need to be

In [40]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Biosystems & Food Engineering")

   Unnamed: 0                                        0
0   BSEN10010  Biosystems Engineering Design Challenge
1   BSEN10020              How Sustainable is My Food?
2   BSEN20010                Engineering and Surveying
3   BSEN20040   Biosystems Engineering Research Trends
4   BSEN20060                             Food Physics
..        ...                                      ...
66  BSEN40790                  Carbon & Sustainability
67  BSEN40800                         Industry Project
68  BSEN40810                           GHG Accounting
69  BSEN40820                      Carbon Footprinting
70  BSEN40830              Scientific Writing Workshop

[71 rows x 2 columns]
......................................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
.................
 ['Biosystems & Food Engineering']

FINISHED, SCRAPED DETAILS ON 70 MODULES, OVER 1 SCHOOLS AND 4 LEVELS
saving to ModuleInf

In [41]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Architecture, Plan & Env Pol")

    Unnamed: 0                                                  0
0    ARCT10010                             Architectural Design I
1    ARCT10020                            Architectural Design II
2    ARCT10030                     Architecture & its Environment
3    ARCT10040  Architectural Technologies I – Introduction to...
4    ARCT10070  Survey I: History and Theory of the Built Envi...
..         ...                                                ...
161  PLAN40390                                 Sustainable Cities
162  PLAN40420                     Planning Pract Internshp 20crs
163  PLAN40430                     Int Specialist Studies 1(Chic)
164  PLAN40440                     Int Specialist Studies 2(Chic)
165  PLAN40540                                       Planning Law

[166 rows x 2 columns]
....................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it
.
ERROR MODULE DETECTED.
Mo

In [42]:
module_assessments, module_descriptors, error_modules, error_module_descriptors=collector(school="Electrical & Electronic Eng")

   Unnamed: 0                                        0
0   DSCY10060          Energy, Climate Change & Policy
1   EEEN10010  Electronic and Electrical Engineering I
2   EEEN10020                  Robotics Design Project
3   EEEN20010                     Computer Engineering
4   EEEN20020       Electrical and Electronic Circuits
..        ...                                      ...
65  EEEN50070                    Industrial Research I
66  EEEN50080                   Industrial Research II
67  EEEN50090                 Industrial Research III 
68  EEEN50100  Stability Analysis of Nonlinear Systems
69  EEEN50110                Power System Optimisation

[70 rows x 2 columns]
......................................................................
ERROR MODULE DETECTED.
Module may need to be inspected, saving information as an error module and continuing without it

 ['Electrical & Electronic Eng']

FINISHED, SCRAPED DETAILS ON 69 MODULES, OVER 1 SCHOOLS AND 5 LEVELS
saving to ModuleInform

## Inspecting Error Modules
As stated, the collector identifies any error modules and saves them for later visual inspection. We will just quickly inspect these modules, and make sure that they are not either a sign of a greater issue, or should be somehow included.

In [43]:
#Inspect the error modules just in case. Simply loop through their assessment and description data and print it for inspection
for i, error in enumerate(zip(ALL_error_modules, ALL_error_module_descriptors)):
    print("********ERROR COUNT %d*********" %i)
    print(error[0])
    print(error[1])
    

********ERROR COUNT 0*********
         Description             Timing     Open Book Exam    Component Scale  \
0  Not yet recorded.  Not yet recorded.  Not yet recorded.  Not yet recorded.   

  Must Pass Component   % of Final Grade    Assessment Type Module Code  
0   Not yet recorded.  Not yet recorded.  Not yet recorded.   CVEN40790  
Subject:                                         Civil Engineering
College:                                Engineering & Architecture
School:                                          Civil Engineering
Level:                                                 4 (Masters)
Credits:                                                      30.0
Credit Split by Trimester:               Autumn 1Spring 2Summer 27
Trimester:                                   Year-long (12 months)
Module Coordinator:                Assoc Professor Arturo Gonzalez
Mode of Delivery:                                 Not yet recorded
Internship Module:                                     