Commit
0 parents
commit f8d8233
Showing
83 changed files
with
72,214 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
*.pyc | ||
.idea/ | ||
*.db | ||
*.psd | ||
*txt | ||
*.ipynb | ||
frontend/config.py | ||
dump.rdb | ||
db_create.py | ||
migrations/ | ||
userData/ | ||
logs/ | ||
dev/ | ||
failedAnalyses/ | ||
article/ | ||
frontend/config.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
Science Flask - A web-app template for online scientific research tools. | ||
Copyright (C) 2017 Daniel Homola | ||
|
||
This program is free software: you can redistribute it and/or modify | ||
it under the terms of the GNU General Public License as published by | ||
the Free Software Foundation, either version 3 of the License, or | ||
(at your option) any later version. | ||
|
||
This program is distributed in the hope that it will be useful, | ||
but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
GNU General Public License for more details. | ||
|
||
You should have received a copy of the GNU General Public License | ||
along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,318 @@ | ||
<p align="center"> | ||
<img src="https://github.com/danielhomola/science_flask/blob/master/frontend/static/img/logo_medium.png?raw=true" alt="Science Flask"/> | ||
</p> | ||
|
||
# Science Flask | ||
|
||
Science Flask is a web-app template for online scientific research tools built | ||
with Python Flask, JavaScript and Bootstrap CSS. | ||
|
||
There are two main reasons for creating this template: | ||
|
||
1. Unfortunately, there are quite | ||
a lot of horrible-looking and barely functioning scientific tools online. | ||
The reason behind this is simple: we are scientists and not | ||
web-developers. Keeping up with the latest trends, learning CSS, HTML | ||
and modern web dev practices is simply not in our job description | ||
and certainly not in our salary or grant money. So most of the time we | ||
just want the tool to be online, in almost any shape or form. This, | ||
however, leads to two things: | ||
- Many amazing projects simply are not made available online, as the | ||
authors didn't have time/know-how/energy/funding for doing that. | ||
- Many projects that do end up getting some form of online presence | ||
look hideous and are not very user-friendly. | ||
|
||
2. I realised that there's an incredible amount of overhead involved with | ||
creating a fully functioning website, with user management, file uploads, | ||
form validation, etc. You can learn Flask in about a weekend to a level | ||
that you can understand code examples and start hacking away. But it | ||
takes a __long__ time to actually get a working website with all the | ||
features listed above. | ||
|
||
I hope that by open-sourcing Science Flask, the activation energy for | ||
turning an offline science project or research tool into a functioning | ||
web-app will be lowered significantly. Hopefully this will not only mean | ||
there'll be more science tools published, but also they'll look and feel | ||
nicer and will be easier to use. | ||
|
||
## Features | ||
|
||
So why shouldn't you just start hacking away, following Flask tutorials? | ||
What exactly can Science Flask give you? | ||
|
||
- __User management:__ Users are only allowed to register with a | ||
valid academic email address. This is to ensure that your tool is mainly | ||
used for academic and research purposes and not for commercial uses. | ||
Furthermore, it comes with everything else you would expect: email addresses are | ||
confirmed, users can change their password, request a password reset if they | ||
forget it, etc. Thanks to Flask-Security, you can also assign roles to different | ||
users and easily build custom user management logic. For example you might | ||
decide that certain users can only use a part of the application, while other | ||
users can access all features. | ||
- __SQL database__: All user, study and analysis data is stored in an | ||
SQLite database by default. This can easily be changed to MySQL or PostgreSQL and | ||
the same code will work, thanks to `SQLAlchemy`. Thanks to Flask-Migrate, if you | ||
change your app's model, you can easily upgrade your database even when your | ||
app is deployed. | ||
- __Admin panel:__ The model (database tables and relations between them) | ||
of your app can be easily edited online, from anywhere using CRUD operations. | ||
Thanks to Flask-Admin, setting up an admin user who can edit users and other | ||
tables is as simple as modifying 2 lines in the config file. | ||
- __Upload form:__ Getting the data from the user sounds super simple but | ||
you'd be surprised how long it takes to get a decent upload page. | ||
Also, it's very easy to build complex form logic from the bricks Science | ||
Flask provides. | ||
- __Profile page__: This collects the uploaded studies of each user and | ||
lets them submit analyses on their data. | ||
- __Analysis form__: Just like with the upload form, you can build custom | ||
logic to ensure you get the parameters from the user just right. The | ||
analysis job is then submitted to the backend. This uses `Celery`. Once | ||
the analysis is ready, the user is notified by email. Then they can | ||
download or check out their results online. | ||
- __Logging__: All errors and warning messages are sent to the admins via | ||
email. All analysis exceptions and errors can be caught so that the | ||
program crashes gracefully, letting the user know what happened. | ||
- __Runs on Bootstrap.css__: Modern, mobile friendly, responsive. | ||
Bootstrap makes writing good looking HTML pages dead easy. | ||
- __Tool tips and tours__: Explain to the user how your application works with | ||
interactive tours (available on all the above listed pages) and tooltips. | ||
- __Python3__: The whole project is written in Python3.5 (because it's 2017). | ||
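
As an illustration of the academic email restriction described under user management, here is a minimal sketch of how such a check could work. The function name `is_academic_email` and the example domain set are hypothetical and are not taken from the Science Flask code base; the real app reads its university domains from the `uni` folder.

```python
def is_academic_email(email, academic_domains):
    """Return True if the email's domain, or any parent domain, is listed."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return False
    parts = domain.split(".")
    # Check "stats.imperial.ac.uk", then "imperial.ac.uk", then "ac.uk", ...
    return any(".".join(parts[i:]) in academic_domains
               for i in range(len(parts)))


# Hypothetical domain list, only for demonstration.
DOMAINS = {"imperial.ac.uk", "mit.edu"}
print(is_academic_email("jane@stats.imperial.ac.uk", DOMAINS))  # True
print(is_academic_email("jane@gmail.com", DOMAINS))             # False
```

Matching parent domains as well means that department subdomains are accepted without listing each of them separately.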
|
||
## Example app - [available here](https://ec2-52-215-3-198.eu-west-1.compute.amazonaws.com/) | ||
|
||
In its current form, Science Flask implements a really simple scientific app. | ||
1. Users can register with an academic email address. | ||
2. Upload one or two datasets as .csv or .txt files. | ||
3. A series of checks are performed on the uploaded datasets: | ||
- all columns have to be numerical | ||
- each dataset must have a feature and sample number that is between a | ||
predefined (see config.py) minimum and maximum | ||
- if we have two datasets uploaded by the user, they need a minimum number | ||
of intersecting samples. | ||
- missing values are imputed with their column-wise median | ||
4. Then the user can submit an analysis and choose how many of the | ||
highest-variance columns are selected from each dataset. | ||
5. These features are used to calculate a correlation matrix between them. | ||
6. If there's only one dataset uploaded, the correlations are calculated | ||
between the features of this one dataset. If two datasets are uploaded then | ||
three matrices/plots are produced: two for the features of the individual | ||
datasets and another that shows the correlation between the features of the | ||
two disparate datasets. | ||
7. The resulting p-values of the correlation matrix are filtered using one of | ||
the user-selected multiple testing correction methods: Bonferroni or | ||
Benjamini-Hochberg. The user can also specify the alpha level for | ||
hypothesis testing. Only correlations that pass both of these will be displayed. | ||
8. The tables and heatmap of correlations can be downloaded by the user or | ||
checked online. | ||
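
The core of the steps above can be sketched with the Python standard library alone. This is only an illustration of median imputation, variance-based feature selection and the two correction methods, not the code the example app actually runs. All function names are hypothetical, and the p-values are assumed to be computed elsewhere.

```python
import math
from statistics import median, variance


def impute_median(columns):
    """Replace missing values (None or NaN) with the column-wise median."""
    def missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    imputed = {}
    for name, values in columns.items():
        med = median(v for v in values if not missing(v))
        imputed[name] = [med if missing(v) else v for v in values]
    return imputed


def top_variance(columns, k):
    """Names of the k columns with the highest sample variance."""
    return sorted(columns, key=lambda name: variance(columns[name]),
                  reverse=True)[:k]


def bonferroni(p_values, alpha=0.05):
    """Keep p-values that are at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Keep the k smallest p-values, k being the largest rank with
    p_(k) <= k / m * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank
    keep = [False] * m
    for i in order[:cutoff]:
        keep[i] = True
    return keep


data = impute_median({"a": [1.0, None, 3.0],
                      "b": [10.0, 20.0, 60.0],
                      "c": [5.0, 5.0, 5.0]})
print(top_variance(data, 2))                          # ['b', 'a']
print(bonferroni([0.001, 0.02, 0.03, 0.5]))           # [True, False, False, False]
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.5]))   # [True, True, True, False]
```

Bonferroni controls the family-wise error rate and is the stricter of the two, while Benjamini-Hochberg controls the false discovery rate, which is why it keeps more correlations in the example.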
|
||
The app runs on AWS on a t2.micro instance. The [__deployment.md__](https://github.com/danielhomola/science_flask/blob/master/deployment.md) | ||
explains the necessary steps to get your app to this stage. | ||
|
||
## Installation | ||
|
||
Here's how to get Science Flask working in 2 minutes on Linux. It should be | ||
fairly similar on OSX. Unfortunately I don't have any experience with Python | ||
development on Windows, so please let me know if you figure out how to get | ||
it to work on a Windows machine. | ||
|
||
Clone the repo | ||
``` | ||
git clone https://github.com/danielhomola/science_flask.git | ||
cd science_flask | ||
``` | ||
|
||
Make a Python 3 virtual environment using virtualenv or virtualenvwrapper. If these | ||
are foreign concepts to you, have a look [here](http://python-guide-pt-br.readthedocs.io/en/latest/dev/virtualenvs/). | ||
|
||
``` | ||
mkvirtualenv --python=/usr/bin/python3 science_flask | ||
workon science_flask | ||
``` | ||
|
||
Install all the required Python packages into your virtual environment. Also | ||
install RabbitMQ, [here's how](https://www.rabbitmq.com/download.html) if you're not on Ubuntu. | ||
``` | ||
pip install -r requirements.txt | ||
sudo apt-get install rabbitmq-server | ||
``` | ||
|
||
Customize `frontend/config_example.py` and rename it to `frontend/config.py` | ||
1. Generate a secret key for your app like [this](https://gist.github.com/geoffalday/2021517) | ||
2. Set up the username, email and password for the admin. You can then log in with | ||
these credentials and go to the Admin profile from the Profile page. There you | ||
can edit all the tables of the database online. | ||
3. Set up mail sending. I used [AWS's SES service](http://docs.aws.amazon.com/ses/latest/DeveloperGuide/quick-start.html), | ||
but you can use [Gmail](http://stackoverflow.com/questions/37058567/configure-flask-mail-to-use-gmail) also. | ||
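
For step 1, one possible way to generate a secret key with nothing but the standard library is sketched below. The helper name `generate_secret_key` is hypothetical, and this is not necessarily what the linked gist does. It is assumed that the printed value is then pasted into `frontend/config.py` as Flask's `SECRET_KEY`.

```python
import os


def generate_secret_key(n_bytes=32):
    """Return a random hex string read from the OS's secure random source."""
    return os.urandom(n_bytes).hex()


# Run once and copy the output into your config file.
print(generate_secret_key())
```

Keep the generated key out of version control, which is also why `frontend/config.py` is listed in `.gitignore`.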
|
||
|
||
Create the SQLite database of the app and add the admin user. | ||
``` | ||
python db_create.py | ||
``` | ||
|
||
If you'd like to use Alembic to migrate your database when you update its schema, | ||
then read [this blog post](https://blog.miguelgrinberg.com/post/flask-migrate-alembic-database-migration-wrapper-for-flask) | ||
and the docs [here](https://blog.miguelgrinberg.com/post/flask-migrate-alembic-database-migration-wrapper-for-flask) and do: | ||
|
||
``` | ||
./manage.py db init | ||
./manage.py db migrate | ||
./manage.py db upgrade | ||
``` | ||
|
||
Open up two terminals. In one of them we'll run the Flask app. First, edit the 1st line | ||
of `manage.py` so that it points to the Python interpreter of your virtual environment. | ||
``` | ||
# make it executable - only need to do this once | ||
chmod +x manage.py | ||
./manage.py runserver | ||
``` | ||
|
||
In the second terminal window start the Celery worker: | ||
``` | ||
celery worker -A frontend.analysis.celery --loglevel=info | ||
``` | ||
|
||
Create three folders for the log files, uploaded files and for the failed analyses. | ||
``` | ||
mkdir userData | ||
mkdir failedAnalyses | ||
mkdir logs | ||
``` | ||
|
||
Now go to Chrome and type in: http://0.0.0.0:5055/ and voila you have | ||
Science Flask running on your computer. You should be able to do everything that | ||
you can do on the online demo. Time to modify it to your needs. | ||
|
||
|
||
## Deployment to the cloud (AWS) | ||
|
||
See [__deployment.md__](https://github.com/danielhomola/science_flask/blob/master/deployment.md). | ||
|
||
## What do you need to get started with web-app development? | ||
|
||
Science Flask - as the name suggests - is built on Flask which is a micro | ||
web-framework written in Python. Don't panic. The actual sciency bit of your | ||
project can be in any language; it doesn't have to be Python. | ||
|
||
You'll need to be familiar with Flask and application development in Flask at | ||
least a bit, if you want to use this template. The good news is though, you | ||
really don't need to be a Flask ninja, and getting to a sufficient level will | ||
not take longer than a weekend. | ||
|
||
Have a look at [this mega tutorial](https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world), | ||
and make sure you read the first 5 chapters. You can pick up the rest as you | ||
go along. Btw, there are dozens of other tutorials on Flask and the community | ||
is really active and helpful so use and abuse Google and StackOverflow :) | ||
|
||
You'll also need a good editor. I'd definitely recommend PyCharm. There's a | ||
community edition which is completely free. It has amazing debugging, | ||
refactoring, and developing capabilities that will make your life so much | ||
easier. | ||
|
||
## Tech stack | ||
|
||
- The user facing part of the __frontend__ is written in HTML, Bootstrap CSS, | ||
JS and Bootstro.js. The templating is done with Jinja2 (default engine by Flask). | ||
|
||
- The website is running on __Flask__. This is serving the clients with the | ||
requested content (HTML, CSS, JS). It also validates the forms, writes | ||
and reads from the database via SQLAlchemy and does everything else | ||
you would expect a web framework to do. | ||
|
||
- The __upload__ uses a bit of AJAX. | ||
|
||
- The __backend__ can be in any language as long as __Celery__ can execute the | ||
job and you figure out how to do that. Doing the actual science bit of | ||
your app in Python however is probably the easiest. Celery will run the | ||
submitted analyses as a job queue, while your web-app can continue to | ||
serve HTTP requests (as it should). | ||
|
||
- __RabbitMQ__ is a message broker between the | ||
Flask web-app and Celery. This is basically a messenger that lets Celery | ||
know about any newly submitted jobs, while it lets Flask know if any of | ||
the submitted jobs have finished running. | ||
|
||
- The models of the app connect to an __SQLite__ database by default but this | ||
could be changed easily. | ||
|
||
- The emails are sent through __Amazon's free SES service__, but this can be | ||
changed to any mail server. | ||
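
To make the division of labour between Flask, RabbitMQ and Celery more concrete, here is a toy version of the same job-queue pattern. It deliberately uses the standard library's `queue` and `threading` modules as stand-ins for the broker and the worker, so it is not Celery code. The `run_analysis` function and the job ids are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for the RabbitMQ broker
results = {}           # stands in for the database that stores job states


def run_analysis(params):
    """Hypothetical analysis: sum the submitted numbers."""
    return sum(params["values"])


def worker():
    """Stands in for a Celery worker: process jobs until a None sentinel."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, params = job
        try:
            results[job_id] = ("done", run_analysis(params))
        except Exception as exc:
            # Record the failure instead of crashing the worker.
            results[job_id] = ("failed", str(exc))
        finally:
            jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

# The "web app" only enqueues jobs and is free to serve other requests.
jobs.put(("job-1", {"values": [1, 2, 3]}))
jobs.put(("job-2", {}))   # missing key, so this job is recorded as failed
jobs.put(None)            # tell the worker to stop
jobs.join()               # demo only: wait until every job has been handled
print(results)
```

In the real stack the queue lives in a separate RabbitMQ process and the worker is started with the `celery` command, so a slow analysis never blocks an HTTP request.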
|
||
## Overall structure of Science Flask | ||
|
||
### frontend | ||
|
||
Frontend holds all the website components (HTML, CSS, JS) and Flask app that | ||
handles the views, the database models, the forms and their validation scripts. | ||
In the following description all folders are marked with bold letters. | ||
|
||
- __static__: holds all the content of the website that will not change | ||
- __css__: CSS files are kept here | ||
- main.css: most of the site's look is defined here | ||
- forms.css: some custom elements for the forms | ||
- bootstro.js: you can modify the look of the tour here | ||
- __fonts__: icons and special fonts live here | ||
- __img__: any images that you'd like to use (logos, figures, etc) | ||
- __js__: all JavaScript is sourced from here | ||
- analysis.js and upload.js: two main JavaScript files that you should | ||
look into | ||
- __uni__: holds the university domain database to check user emails | ||
- robots.txt, sitemap.xml: both needed so search-bots can do their job | ||
- __templates__: all the individual pages of the website are here | ||
- __security__: user registration and related pages | ||
- __utils__: standard error code pages and some custom error pages | ||
- \_\_init\_\_.py: setup/initialization and configuration of the Flask app | ||
- analysis.py: main script that is called when the user submits a new job | ||
- config.py: all configuration info of the Flask app is stored here, make | ||
sure to read it and set everything up properly. | ||
- forms.py: upload, analysis and registration forms with their validators | ||
- models.py: database models/tables for users, studies, analyses | ||
- views.py: implements the main logic of the individual pages, such as | ||
upload, analysis, profile | ||
- view_functions.py: additional functions that are called by views.py which | ||
are put here so that views.py doesn't get more bloated than it already is. | ||
|
||
### backend | ||
|
||
This is where you'd put the actual bits and pieces of your scientific tool that | ||
do the analysis. In the frontend, analysis.py will call your pipeline | ||
of functions from here to carry out what your app is advertised to do. | ||
|
||
- utils: some utility functions already live here that are called by the frontend | ||
- check_uploaded_files.py: this will check the uploaded files and make sure | ||
all the values in there are sensible and numerical. You can modify this to | ||
your liking/needs. | ||
- io_params.py: while an analysis/job is running, a dictionary/hash of | ||
parameters is kept that holds all the information about the particular | ||
job and the user. This script reads those parameters in and returns them | ||
as a dictionary so analysis.py can use it. | ||
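
The kind of validation that check_uploaded_files.py performs could look roughly like the sketch below. It is an assumption-laden illustration rather than the actual script: the function name `check_dataset`, the file layout (a header row plus a first column of sample ids) and the limit parameters are all hypothetical.

```python
import csv
import io


def check_dataset(text, min_samples, max_samples, min_features, max_features):
    """Return (ok, message) for CSV text with a header row and an id column."""
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2:
        return False, "no data rows"
    header, data = rows[0], rows[1:]
    n_features = len(header) - 1
    if not min_samples <= len(data) <= max_samples:
        return False, "number of samples out of range"
    if not min_features <= n_features <= max_features:
        return False, "number of features out of range"
    for row in data:
        if len(row) != len(header):
            return False, "row length does not match header"
        for cell in row[1:]:
            if cell.strip() == "":
                continue  # missing value, to be imputed later
            try:
                float(cell)
            except ValueError:
                return False, "non-numerical value: %r" % cell
    return True, "ok"


good = "id,f1,f2\ns1,1.0,2\ns2,,3.5\n"
bad = "id,f1\ns1,abc\ns2,1\n"
print(check_dataset(good, 2, 10, 1, 5))   # (True, 'ok')
print(check_dataset(bad, 2, 10, 1, 5))    # (False, "non-numerical value: 'abc'")
```

Returning a message together with the boolean makes it easy to show the user exactly why an upload was rejected.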
|
||
|
||
### userData | ||
|
||
Each user that registers has a folder. Each uploaded dataset is then placed | ||
in the given user's folder. Finally the results of an analysis are stored | ||
under the folder of the given study. This results in a hierarchy like this: | ||
|
||
- user1 | ||
- study1 | ||
- analysis1 | ||
- analysis2 | ||
- study2 | ||
- analysis1 | ||
- user2 | ||
- study1 | ||
- analysis1 | ||
- analysis2 | ||
- analysis3 | ||
- ... | ||
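
A folder hierarchy like the one above can be built with `pathlib`. The helper below is a hypothetical sketch, not the function Science Flask uses, and the demo writes into a temporary directory instead of the real `userData` folder.

```python
import tempfile
from pathlib import Path


def analysis_dir(base, user_id, study_id, analysis_id):
    """Create base/<user>/<study>/<analysis> if needed and return its path."""
    path = Path(base) / str(user_id) / str(study_id) / str(analysis_id)
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    created = analysis_dir(tmp, "user1", "study1", "analysis2")
    relative = created.relative_to(tmp).as_posix()
    existed = created.is_dir()

print(relative)  # user1/study1/analysis2
```

Because of `exist_ok=True`, calling the helper again for the same analysis is harmless, which is convenient when a job is re-run.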
|
||
### failedAnalyses | ||
|
||
When an analysis fails, the state of the run (intermediate files and parameters | ||
prior to the bug) is all saved here. | ||
|
||
|
||
## Cite Science Flask | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
""" | ||
Methods for CorrMapper's analysis pipeline. | ||
Code by: Daniel Homola, Imperial College London, 2015 | ||
------------------------------------------------------------------------------ | ||
Copyright (c) 2015, Daniel Homola | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
1. Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
2. Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR | ||
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
""" | ||
|
||
__version__ = '1.0' | ||
__author__ = 'Daniel Homola' | ||
__copyright__ = "Copyright 2015" | ||
__credits__ = 'Daniel Homola' | ||
__license__ = "BSD" | ||
__maintainer__ = "Daniel Homola" | ||
__email__ = "dani.homola@gmail.com" | ||
__status__ = "Prototype" |
Empty file.