A skeleton of the website intended to display interactive graphs statistically quantifying the gender gap in Wikipedia biographies, through several indicators.
The project was conceived by Max Klein (https://notconfusing.com) and is funded by the Wikimedia organization through an Individual Engagement Grant. Development is continued by Max himself and a volunteer team of researchers, developers and designers.
Read the complete prototype research and preliminary results.
The website is built with Nikola, a popular Python-based static site generator. After the Wikidata data is post-processed, graphs are generated using Bokeh, a Python-based interactive visualization library that targets web browsers.
We currently intend to display four graphs: Gender by Culture, Gender by Country (World Map), Gender by Date of Birth, and Wikipedia Language by Gender.
To run the website locally, ensure that you have installed the latest versions of Nikola and Bokeh. Installation instructions can be found on their respective websites. Then run the following commands:
git clone https://github.com/hargup/WIGI-website
cd WIGI-website
nikola build && nikola serve
If everything goes well, you should be able to see the WIGI website at 127.0.0.1:8000.
All you need to do to run the WIGI website and play with the graphs is run nikola build && nikola serve. If, however, you want to add more graphs or play with new data, there are a couple of things to note.
It all starts with the conf.py file in the repository root directory. This file configures how Nikola behaves and how it generates static HTML pages from templates.
- Every post is constructed from its own template, and carries metadata and instructions on how to render the specific HTML page. For example, the gender by country.md post has the following line in its description:
.. template: gender_by_country.tmpl
This specifies the template used to create the gender_by_country.html file. The templates are located in the templates/ directory.
- Templates describe how to build the web page and where to embed the Bokeh graph. For example, if you open gender_by_country.tmpl, you will find the following block:
<%block name="plot">
${gender_by_country_plot}
</%block>
which embeds the plot data (through the gender_by_country_plot variable) within the plot block of the HTML page, where it is rendered.
- The interesting part, how Nikola templates receive the plot data, can be answered by inspecting conf.py. When nikola build is run, conf.py is executed first. In this file, we import our Bokeh plot-generating functions and generate each plot's data. The data are then made available to all Nikola templates by putting them into GLOBAL_CONTEXT:
GLOBAL_CONTEXT = {
"gender_by_country_plot": gender_by_country.plot(),
"gender_by_culture_plot": gender_by_culture.plot(),
"gender_by_dob_plot": gender_by_dob.plot(),
"language_by_gender_plot": language_by_gender.plot()
}
These variables are referenced in the respective template files (as explained in point 2) to embed the plot data.
All of this happens automatically when you run nikola build.
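The flow described above can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the project's code: the plot() function here is a stand-in that returns a fixed HTML string instead of real Bokeh output, and string.Template is used in place of Nikola's template engine because it happens to accept the same ${name} placeholder syntax.

```python
from string import Template


def plot():
    """Stand-in for a Bokeh plot function; returns a fixed HTML snippet."""
    return '<div id="gender-by-country">plot goes here</div>'


# As in conf.py: the plot function is called once, when the configuration
# is loaded, and its result is stored under a name templates can refer to.
GLOBAL_CONTEXT = {"gender_by_country_plot": plot()}

# Simplified stand-in for a template file; the real templates are rendered
# by Nikola, not by string.Template.
page_template = Template("<section>${gender_by_country_plot}</section>")

# Rendering replaces the placeholder with the value from GLOBAL_CONTEXT.
html = page_template.substitute(GLOBAL_CONTEXT)
print(html)
```

The point of the sketch is the ordering: the plot data exists before any template is rendered, so every template can refer to it by name.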
If you have a new plot to add, you need to add the following files:
- A Python script that generates the Bokeh plot data, with its function imported in conf.py. Place the script in the plots/ directory, and see any existing file to learn what the function should do and return.
- A template file, <graph>.tmpl, describing where you want to embed the plot data.
- A Markdown file, <post>.md, referencing the template in its description, along with any other material (text, commentary, citations, etc.) you want in the post.
Please see any existing file for a clear example. Once you are done, run nikola build && nikola serve.
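As a rough starting point, a new plot script might look like the sketch below. It assumes that the plot function returns an embeddable HTML string built with Bokeh's components() helper; the existing scripts in plots/ may return something different, so check one before copying this. The categories and counts are placeholders for illustration only.

```python
from bokeh.embed import components
from bokeh.plotting import figure


def plot():
    """Build a small bar chart and return it as an embeddable HTML snippet."""
    # Placeholder values for illustration only; a real script would read
    # its data from a CSV file in data/.
    categories = ["a", "b", "c"]
    counts = [3, 1, 2]

    fig = figure(x_range=categories, title="Example plot")
    fig.vbar(x=categories, top=counts, width=0.8)

    # components() returns a <script> tag and a <div> placeholder that can
    # be placed together in an HTML page.
    script, div = components(fig)
    return script + div
```

The snippet returned by components() does not include the BokehJS library itself, so the page that embeds it must load BokehJS separately.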
All the data used by the Bokeh scripts can be found in the data/ directory in the repository root. If you want to use new data, update the respective CSV files with suitable data. It is recommended to keep any new data files in this directory as well.
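Before wiring new data into a plot, it can help to check that a CSV file parses as expected. The sketch below uses only the standard library; the column names and sample rows are hypothetical and do not reflect the actual schema of the files in data/.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a file in data/; the column names
# and rows are made up for illustration.
SAMPLE_CSV = """country,gender
X,female
X,male
Y,female
"""


def count_by_gender(csv_file):
    """Count rows per value of the (assumed) 'gender' column."""
    reader = csv.DictReader(csv_file)
    return Counter(row["gender"] for row in reader)


counts = count_by_gender(io.StringIO(SAMPLE_CSV))
print(counts["female"], counts["male"])
```

With a real file, the same function can be passed a handle opened with open(path, newline="") instead of the in-memory sample.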
Max Klein (@notconfusing), Vivek Rai (@vivekiitkgp), Harsh Gupta (@hargup)
Source code files are available under the MIT License, and content is available under a Creative Commons Attribution-ShareAlike 4.0 International License.