Feature/postgres database #523

Merged 107 commits on Mar 22, 2019
041a377
Basics of data importer for database change. Can import dbSNP.
Sep 8, 2018
6975d9f
Some changes to how arguments are passed.
Sep 8, 2018
bb36ec7
Adds reference set importer to importer, as well as bug fixes to dbsn…
Sep 10, 2018
9866099
Adds importer for old database data tables.
Sep 14, 2018
a585601
removes accidentally committed debugging code.
Sep 14, 2018
150a636
Adds support to move user fields.
Sep 15, 2018
e327cbb
Adds basic raw data importer.
Sep 20, 2018
e2faa76
Quick and dirty postgres
viklund Sep 20, 2018
eff70cd
Adds postgres schemas
Sep 20, 2018
4d5bdcb
fixes bug where gziped lines where not read in the same way as unzipp…
Oct 22, 2018
2bed837
adds dry-run support to dbsnp importer.
Oct 23, 2018
1bc9d38
adds mysql-data and downloaded files to gitignore.
Oct 23, 2018
55b138c
adds importer requirements.
Oct 23, 2018
c228a88
Adds dry run support to reference set import.
Oct 23, 2018
7016c5f
updates to old db importer to support auto-picking refset and dry-runs.
Oct 26, 2018
ecc2e8e
Corrects a help message.
Oct 26, 2018
6903360
adds --dry-run, --dataset, and --version flags to raw data importer.
Oct 29, 2018
40f3029
Adds number of variants to dataset_versions
Nov 8, 2018
81795b6
Updates importer to match updated postgres schema.
Nov 14, 2018
fce1a30
Adds functions from the old db peewee wrapper to the new one.
Nov 21, 2018
f97f247
Docker-compose for postgres database.
Nov 21, 2018
ca939bc
Main site working from postgres database (not browser).
Dec 4, 2018
f860f06
Fix the name of the end_pos column
talavis Jan 4, 2019
e4102b9
changed database connection type; playhouse.postgres_ext requires Pos…
talavis Jan 7, 2019
4b3ceda
Should hopefully give us actual json in the db
talavis Jan 21, 2019
03b7396
gencode files are gtf, ie 1-indexed
talavis Jan 21, 2019
ea0123c
Should fix gene stop positions
talavis Jan 21, 2019
a296220
Should fix the division of multiple names
talavis Jan 21, 2019
73de6e2
should fix dbsnp problem
talavis Jan 21, 2019
40c6420
quality metrics should also remain as dict
talavis Jan 24, 2019
6b0b692
other_names for a gene is now in a separate table
talavis Feb 8, 2019
f025a72
Updates db handler to match array-to-table changes in schema.
Feb 1, 2019
280c7c3
fixed a confusingly named field (gene -> variant)
talavis Feb 8, 2019
f33eb97
conversion str->int and a forgotten .execute() for the last batch
talavis Feb 8, 2019
14e3d34
hom_count parsed
talavis Feb 4, 2019
6a62d32
attempt to add beacon-only import. untested
talavis Feb 6, 2019
b83e1e9
Fix typos etc
MalinAhlberg Feb 6, 2019
8301059
make some more stuff optional
talavis Feb 6, 2019
fc49cd6
more generic vcf parsing for the beacon
MalinAhlberg Feb 6, 2019
9a37e0a
Skip lines with non-standard rsid
MalinAhlberg Feb 7, 2019
05ac2ea
fix: Restore mysql-settings.
Feb 8, 2019
5139922
variant parser updated for new db schema
talavis Feb 11, 2019
5a35aa7
and a forgotten get()
talavis Feb 11, 2019
30b56ae
a five-character string probably never matches a four-character one. …
talavis Feb 11, 2019
ce06d49
Wrong db
talavis Feb 11, 2019
b376e3d
a couple of pylint fixes
talavis Feb 11, 2019
7376aa9
slow id lookup
talavis Feb 11, 2019
25b26ec
refdata changed to local objects instead of class members
talavis Feb 11, 2019
f3381c5
forgot a self
talavis Feb 11, 2019
556e26c
update parameters for function call
talavis Feb 11, 2019
19775be
feat: Change mysql database to postgres database
Feb 13, 2019
035406b
feat: Rewrite travis tests to use postgres
Feb 13, 2019
d0d6961
Add sampleCount in import script
MalinAhlberg Feb 13, 2019
9332827
added sample_count column to model
talavis Feb 13, 2019
7c04ddb
Remove sample_count model again
MalinAhlberg Feb 14, 2019
e0ef58c
Only count samples in header, not on each data row
MalinAhlberg Feb 14, 2019
64ede98
Save sampleCount in sample_sets, add parameter to import script
MalinAhlberg Feb 14, 2019
ecb8eee
Fix typo
MalinAhlberg Feb 19, 2019
70afe46
Add parameters for assembly_id and beacon_description
MalinAhlberg Feb 19, 2019
9c4dd27
Fix small mistakes
MalinAhlberg Feb 20, 2019
1044722
Improve help messages
MalinAhlberg Feb 20, 2019
ef81b39
Add parameter for datasetsize
MalinAhlberg Feb 20, 2019
9742f12
Fix parsing of multiple rsids
MalinAhlberg Feb 21, 2019
02b8536
use beacon fix for A[NC] for all datasets without A[NC]_adj. Also che…
talavis Feb 25, 2019
34f51d9
moved ENS[GT] checks to parser function, also added set to remove dup…
talavis Feb 26, 2019
eb04242
add tracking of return value from the actual tests
talavis Feb 27, 2019
38dd02a
apparently {} are not needed on arithmetic variables
talavis Feb 27, 2019
d082b72
forgot to remove the other {}
talavis Feb 27, 2019
a351e94
added space to make the code easier to read
talavis Mar 6, 2019
28fd9f3
Log info about dryrun
MalinAhlberg Mar 5, 2019
5808409
new indexes added; sorted alphabetically
talavis Feb 27, 2019
4cf890c
forgotten ;
talavis Feb 28, 2019
668cb90
g is before t
talavis Feb 28, 2019
12d21eb
indexes for genes and transcripts associated with variants
talavis Feb 28, 2019
f7ef8e6
and indexes for variant->genes/transcripts
talavis Feb 28, 2019
0671e15
Add indices for beacon
MalinAhlberg Mar 5, 2019
9a22dd1
make sure the correct reference set is used for genes/transcripts
talavis Mar 1, 2019
7d1dbc7
refgenes/transcripts -> ref_*
talavis Mar 8, 2019
e85c2dd
added missing function
talavis Mar 8, 2019
f91f49e
dbid renamed to refset, function name updated
talavis Mar 8, 2019
2fa3408
feat: Add option to specify settings file
Mar 8, 2019
d1393e6
Adding beacon database schema
viklund Mar 14, 2019
05a60e2
Add missing quotes and make existing quotes consistent
kusalananda Mar 14, 2019
64439a6
Remove $ on variable in arithmetic context
kusalananda Mar 15, 2019
f8e7be8
feat: Remove dbSNP and OMIM data
Mar 18, 2019
e705ead
feat: Adapt beacon schemat to schema changes
Mar 18, 2019
8346951
better hom_count?
talavis Mar 18, 2019
6efde85
use minimal representation for variants
talavis Mar 18, 2019
bca9f8d
forgot to take care of unused return value
talavis Mar 18, 2019
b5a2800
the return value _is_ needed
talavis Mar 18, 2019
ebdf393
Attempt to get code to work: don't reference data before it is created
MalinAhlberg Mar 18, 2019
fc8aee2
fix hotfix
talavis Mar 18, 2019
34ca445
Merge pull request #527 from NBISweden/hot_fix/bug_fix
talavis Mar 18, 2019
f56be4a
fix for int->tuple
talavis Mar 18, 2019
46eb946
test: Fix travis tests for postgres update
Mar 19, 2019
406d59e
data->base; remove unintended debug line
talavis Mar 19, 2019
e35c557
perform coverage reformatting immediately
talavis Mar 20, 2019
e618972
a few fixes for batch management
talavis Mar 20, 2019
7488838
clarify that the values are in data.*
talavis Mar 20, 2019
63c019f
first implementation of replacement function for progress
talavis Mar 20, 2019
3e53e49
negative last_progress in raw import as well
talavis Mar 20, 2019
ed66320
added missing space
talavis Mar 20, 2019
f803efa
another fix for coverage
talavis Mar 20, 2019
c673791
possible to only import variants or coverage
talavis Mar 20, 2019
a6e75ed
removed need of the function
talavis Mar 20, 2019
3cd211b
last coverage insert did not use execute()
talavis Mar 21, 2019
dc7f872
pylint fixes
talavis Mar 22, 2019
10 changes: 10 additions & 0 deletions .gitignore
@@ -15,3 +15,13 @@ tornado/static/js/app.min.js
 backend/static
 backend/templates
 static
+# importer and config stuff
+mysql-data*
+scripts/importer/downloaded_files
+# docker stuff
+postgres-data
+# local personal things
+personal
+# travis test remnants
+master-schema.sql
+settings.json.tmp
5 changes: 5 additions & 0 deletions .travis.yml
@@ -10,3 +10,8 @@ install:
 - pip install coverage coveralls
 script:
 - test/travis_script.sh
+addons:
+  postgresql: "10"
+  apt:
+    packages:
+    - postgresql-client-10
9 changes: 5 additions & 4 deletions Dockerfile-backend
@@ -1,13 +1,14 @@
-FROM ubuntu:16.04
+FROM ubuntu:18.04

 RUN apt-get update && apt-get install -y \
     python3 \
-    python3-pip \
-    libmysqlclient-dev
+    python3-pip

 ADD . /code
+COPY settings_sample.json /settings.json
+RUN sed -i 's/"postgresHost"\s*:.*,/"postgresHost" : "db",/' /settings.json
 WORKDIR /code

 RUN pip3 install -r backend/requirements.txt

-CMD ["python3", "backend/route.py", "--develop"]
+CMD ["python3", "backend/route.py", "--develop", "--settings_file", "/settings.json"]
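The `sed` line in this Dockerfile rewrites the `postgresHost` entry of the copied settings file so the backend container points at a host named `db`. Below is a minimal Python sketch of the same substitution, for anyone who wants to check what the pattern matches. The function name and the sample settings text are hypothetical; only the regular expression and the replacement value come from the diff.

```python
import re


def point_postgres_host_at(settings_text: str, host: str) -> str:
    """Rewrite the "postgresHost" entry, mirroring the sed expression in the diff."""
    # Same pattern as the sed call: the key, optional whitespace, a colon,
    # then everything up to the last comma on that line.
    pattern = r'"postgresHost"\s*:.*,'
    # A function replacement avoids backslash processing in the host string.
    return re.sub(pattern, lambda _match: f'"postgresHost" : "{host}",', settings_text)


# Hypothetical settings content; the real settings_sample.json is not shown here.
sample = '{\n    "postgresHost" : "postgres host",\n    "postgresPort" : 5432\n}\n'
rewritten = point_postgres_host_at(sample, "db")
print(rewritten)
```

Note that the pattern requires a trailing comma, so it would not match if `postgresHost` were the last key in the JSON object. `\s` in the `sed` expression is a GNU extension, which is what the Ubuntu base image provides.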
6 changes: 6 additions & 0 deletions Dockerfile-database
@@ -0,0 +1,6 @@
+FROM postgres:10
+
+ENV POSTGRES_DB swefreq
+COPY sql/data_schema.sql /docker-entrypoint-initdb.d/01_data_schema.sql
+COPY sql/user_schema.sql /docker-entrypoint-initdb.d/02_user_schema.sql
+COPY sql/beacon_schema.sql /docker-entrypoint-initdb.d/03_beacon_schema.sql
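The `01_`, `02_`, `03_` prefixes matter because the official `postgres` image runs the files in `/docker-entrypoint-initdb.d` in sorted name order, and only when the data directory is empty at first start. The sketch below reproduces that ordering for the three file names in this Dockerfile. The assumption that later schemas depend on earlier ones is a guess, not something the diff states.

```python
# File names as copied into /docker-entrypoint-initdb.d by Dockerfile-database,
# deliberately listed out of order here.
init_scripts = [
    "03_beacon_schema.sql",
    "01_data_schema.sql",
    "02_user_schema.sql",
]

# The entrypoint processes scripts in sorted name order, so the numeric
# prefixes decide which schema is created first.
run_order = sorted(init_scripts)

for position, script in enumerate(run_order, start=1):
    print(position, script)
```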
10 changes: 5 additions & 5 deletions Dockerfile-frontend-rebuilder
@@ -1,4 +1,4 @@
-FROM ubuntu:16.04
+FROM ubuntu:18.04

 RUN apt-get update && \
     apt-get install -y \
@@ -7,12 +7,12 @@ RUN apt-get update && \
         python3 \
         python3-pip \
         python3-pyinotify \
-        inotify-tools \
-        libmysqlclient-dev && \
+        inotify-tools && \
     update-alternatives --install /usr/bin/python python /usr/bin/python3 5

-RUN curl -sL https://deb.nodesource.com/setup_6.x | bash - && \
-    apt-get install -y nodejs
+RUN apt-get install -y \
+        nodejs \
+        npm

 ADD . /code
 WORKDIR /code
4 changes: 2 additions & 2 deletions backend/application.py
@@ -259,7 +259,7 @@ def get(self, dataset, version=None):
         for f in dataset_version.files:
             d = db.build_dict_from_row(f)
             d['dirname'] = path.dirname(d['uri'])
-            d['human_size'] = format_bytes(d['bytes'])
+            d['human_size'] = format_bytes(d['file_size'])
             ret.append(d)

         self.finish({'files': ret})
@@ -576,7 +576,7 @@ def get(self, dataset):
             return

         self.set_header("Content-Type", logo_entry.mimetype)
-        self.write(logo_entry.data)
+        self.write(logo_entry.data.tobytes())
         self.finish()
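The `.tobytes()` call in the second hunk is most likely needed because, under Python 3, psycopg2 returns PostgreSQL `bytea` values as `memoryview` objects, while Tornado's `write()` accepts only `bytes`, `str`, and `dict`. The sketch below shows the conversion on its own. The helper name `as_bytes` and the sample payload are hypothetical and not part of the PR.

```python
def as_bytes(value):
    """Return a bytes copy of a binary column value.

    Hypothetical helper: handles the memoryview a Postgres driver may
    return for bytea columns, as well as plain bytes or bytearray.
    """
    if isinstance(value, memoryview):
        # Copies the underlying buffer into an immutable bytes object.
        return value.tobytes()
    return bytes(value)


# Stand-in for logo_entry.data as it might come back from the database.
stored = memoryview(b"\x89PNG\r\n")
payload = as_bytes(stored)
print(type(payload).__name__, len(payload))
```

A helper like this is tolerant of whichever type the driver returns, whereas calling `.tobytes()` directly, as the diff does, assumes the value is always a `memoryview`.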

