Survey of software use at the University of Southampton

DOI: 10.5281/zenodo.3569549

In June 2019 we conducted a survey of software use across 6355 academic staff and PhD students. The survey was open for two weeks and collected 603 responses.

The raw data was cleaned using OpenRefine. Email addresses were removed for privacy reasons. Invalid responses (namely those not associated with a known faculty at the University of Southampton) were removed. The job titles provided by respondents were reduced to a set of known job titles (e.g. "Prof", "Professor" and "Proffessor" [sic] were all converted to "Professor"). The result of this cleaning is the file data/Cleaning-of-Uni-Soton-Software-Survey-26Jun19.csv.

Results

If you want quick access to the results, take a look at the report.

Charts of the univariate analysis can also be seen in this simple presentation.

Important points

Licences for the code, the data, and the reports and charts can be found in the LICENSE, DATA LICENCE and REPORT LICENCE files respectively.
The code runs on Python 3.

Running the analysis

Get the files and data: Clone the git repository
We suggest the use of a virtual environment. The file requirements.txt can be used to install the necessary libraries.
Run the analysis script analyse_survey.py.

Note that the file column_name_renaming.py contains instructions for shortening the column names (using the full question as the column name gets tedious). It also lists how the results of each question are sorted: some questions are best sorted by the size of response, while others, such as the scale questions that rank responses from 1 to 5, need their results in a specific order (i.e. 1 to 5).
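The two sort orders described above can be illustrated with a minimal sketch. It uses only the standard library, and the names SORT_BY_FIXED_ORDER, summarise and scale_question_example are hypothetical: they are not taken from column_name_renaming.py or analyse_survey.py, which may do this differently.

```python
from collections import Counter

# Hypothetical lookup of questions whose answers must keep a fixed order.
# The question name is illustrative only, not a real column in the survey data.
SORT_BY_FIXED_ORDER = {"scale_question_example": ["1", "2", "3", "4", "5"]}


def summarise(question, responses):
    """Count responses and return (answer, count) pairs in a suitable order."""
    counts = Counter(responses)
    if question in SORT_BY_FIXED_ORDER:
        # Scale questions: keep the 1-to-5 order, including answers nobody chose.
        return [(answer, counts.get(answer, 0))
                for answer in SORT_BY_FIXED_ORDER[question]]
    # Other questions: largest response first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Keeping answers that nobody chose in the scale questions is a design choice of this sketch: it gives every chart of a 1-to-5 question the same five bars.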

Bivariate analysis is controlled by the file bivariate_instructions.py, which defines a dictionary called which_by_which. Each key is the question by which you wish to segment, and its value is a list of the questions of interest to be segmented. For example, if you want to investigate how the number of people who develop code varies by faculty, you would set up the dictionary in bivariate_instructions.py as follows:

which_by_which = {'faculty': ['develop_own_code']}

If you also wanted to investigate how responses to the training question are segmented by faculty, you would use the dictionary:

which_by_which = {'faculty': ['develop_own_code', 'training']}
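To show how such a dictionary could drive the analysis, here is a minimal sketch that counts the answers to each question of interest within each segment. The function bivariate_counts and the example rows are hypothetical and use made-up answers; analyse_survey.py may implement this differently.

```python
from collections import Counter, defaultdict

which_by_which = {'faculty': ['develop_own_code', 'training']}

# Hypothetical responses for illustration only; these are not survey results.
example_rows = [
    {"faculty": "Medicine", "develop_own_code": "Yes", "training": "No"},
    {"faculty": "Medicine", "develop_own_code": "No", "training": "No"},
    {"faculty": "Engineering", "develop_own_code": "Yes", "training": "Yes"},
    {"faculty": "Engineering", "develop_own_code": "Yes", "training": "No"},
]


def bivariate_counts(rows, instructions):
    """For each (segment, question) pair, count answers within each segment value."""
    results = {}
    for segment_by, questions in instructions.items():
        for question in questions:
            table = defaultdict(Counter)
            for row in rows:
                table[row[segment_by]][row[question]] += 1
            # Convert to plain dicts so the result is easy to inspect or write out.
            results[(segment_by, question)] = {
                segment: dict(answers) for segment, answers in table.items()
            }
    return results


counts = bivariate_counts(example_rows, which_by_which)
print(counts[("faculty", "develop_own_code")])
```

Each key of the returned dictionary is a (segment, question) pair, so adding a question to the list in which_by_which adds one more table without any other change.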

The separate bivariate files (found in output_csv/bivariate) are brought together by the script combine_bivariate_results_for_graphing.py to produce the summary csvs found in output_csv/bivariate/summaries.
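As a rough illustration of this kind of combining step, the sketch below stacks every csv in a directory into one file and records which file each row came from. It is an assumption about the general approach, not a description of combine_bivariate_results_for_graphing.py; the function name combine_csvs and the "source" column are hypothetical.

```python
import csv
from pathlib import Path


def combine_csvs(input_dir, output_file):
    """Stack every csv in input_dir into one file, tagging rows with their source."""
    combined = []
    fieldnames = ["source"]
    # sorted() reads the file list up front, so the output file is never re-read.
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                for name in row:
                    if name not in fieldnames:
                        fieldnames.append(name)
                combined.append({"source": path.stem, **row})
    output_file = Path(output_file)
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with output_file.open("w", newline="") as handle:
        # restval fills columns that are missing from some input files.
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(combined)
    return len(combined)
```

A call such as combine_csvs("output_csv/bivariate", "output_csv/bivariate/summaries/combined.csv") would then write one summary file; the file name combined.csv is made up for this example.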

Files and directories

analyse_survey.py: the main analysis script that converts the survey data into csvs that each summarise a question.
column_name_renaming.py: lookup file used for shortening names of columns of data.
bivariate_instructions.py: lookup file used to instruct analyse_survey.py on which bivariate analyses to conduct.
combine_bivariate_results_for_graphing.py: combines the individual bivariate csv files to produce useful summaries.
UniSotonSoftwareSurvey_June2019.pdf: a pdf file of the original survey used to collect the data
data/Cleaning-of-Uni-Soton-Software-Survey-26Jun19.csv: an anonymised version of the survey results
output_csv/: all output csvs are stored in this directory and its subdirectories.
report/Results of University of Southampton software survey June 2019.ipynb: Jupyter notebook used to write report
report/Results of University of Southampton software survey June 2019.pdf: pdf of Jupyter notebook
report/charts/: charts of all the output csvs as png images
report/charts/plot_details/: csvs holding parameters used to draw charts

Plotting

You can plot the csv files using any graphing program of your choice. Personally, I use a graphing program I wrote in Python to make the results look pretty. Feel free to use it too (this is easier if you use the pre-existing parameters in the csvs held in report/charts/plot_details/).

Southampton-RSG/soton_software_survey_analysis_2019
