🐳📺🎨 Automappa web app with server upload and background pre-/post-processing services #35

Merged
merged 38 commits into from
Aug 3, 2022

evanroyrees
Owner

@evanroyrees commented Apr 24, 2022

Changes

  • Fixes 🏠 Add home tab #27
  • Addition of dash-uploader so that files can be uploaded to the server and ingested into the postgres database
  • 🌻 📺 Addition of services for distributed task-queue (Celery, Flower, Prometheus, Grafana)
  • 🐳 Addition of docker-compose.yml for managing Automappa services
  • 🐳 🎨 Add Services: web, celery, postgres, rabbitmq, redis, flower, prometheus, grafana
  • Add metagenome metadata annotations (e.g. k-mer frequency embedding) to tasks for task-queue workers
  • Add geom_median function to determine each cluster's embedding's geometric median (incubating feature)
    • Each cluster's geometric median could be overlaid (or hidden, etc.) to reduce over-plotting and assess potential inclusion/exclusion criteria related to surrounding unclustered contigs
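
A minimal sketch of the geometric-median idea, for illustration only. It does not use the `geom-median` package added in this PR and is not the PR's `geom_median` implementation; it swaps in a plain-Python Weiszfeld iteration over 2D embedding coordinates. The function names, the `coords`/`clusters` dictionaries and the tolerance values are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def geometric_median(
    points: Sequence[Point], tol: float = 1e-7, max_iter: int = 200
) -> Point:
    """Approximate the geometric median of 2D points with Weiszfeld's algorithm."""
    if not points:
        raise ValueError("need at least one point")
    # Start from the centroid (arithmetic mean) of the points.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            dist = math.hypot(px - x, py - y)
            if dist < 1e-12:
                # Simplification: skip points that coincide with the current
                # estimate so we never divide by zero.
                continue
            weight = 1.0 / dist
            num_x += weight * px
            num_y += weight * py
            denom += weight
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:
            break
    return (x, y)


def cluster_geometric_medians(
    coords: Dict[str, Point], clusters: Dict[str, str]
) -> Dict[str, Point]:
    """Group contig coordinates by cluster label; return one median per cluster."""
    grouped: Dict[str, List[Point]] = {}
    for contig, cluster in clusters.items():
        grouped.setdefault(cluster, []).append(coords[contig])
    return {cluster: geometric_median(pts) for cluster, pts in grouped.items()}
```

Unlike the arithmetic mean, the geometric median is not pulled far by a few outlying contigs, which is presumably why it is attractive as a single overlay point per cluster.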

TODO: Celery task-queue tasks

Pre-processing

  • Embedding
  • GC content
  • Coverage
  • Taxonomy
  • ...

Post-processing

  • CheckM
  • GTDB-Tk
  • METABOLIC
  • ...

📝🎨 Add Bioconda project url to setup.py
🎨 Add toast when browser storage limits are reached (issue #11)
🎨📝 Move favicon to automappa/assets
⬆️🎨 Add geom-median to environment.yml
🎨 Add get_clusters_geom_medians(...) to celery tasks
⬆️🎨 Bump VERSION to 2.2.0
🙈 Add data to .gitignore
Add TODOs for WIP next steps... To be checked before merge into develop
🎨🐛 Prevent samples-store data from being overwritten on initial callback
🎨 samples-store data now persistent with local session
🎨 Refactor some variables in MAG refinement
🎨 WIP
… MetaData.table.keys()

Should now be able to 🔥 remove all of the data stores to track the fileupload table_id... 🔥
🐳🌻🐘🎨 Add docker-services: celery, flower, prometheus, grafana, redis, rabbitmq
🎨🔥 Remove environment/*.env files. Consolidate to .env with corresponding prefixes
🎨 Change pydantic models built on BaseSettings in automappa.settings to use only .env with env_prefix
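
To illustrate the single `.env` plus prefix convention, here is a standard-library sketch of prefix-scoped settings. It is a stand-in for the idea, not pydantic's implementation and not the code in `automappa.settings`; in pydantic v1 the equivalent is a `BaseSettings` subclass whose inner `Config` sets `env_prefix`. The `SERVER_` prefix, the `read_prefixed` helper and the `ServerSettings` fields are assumptions, loosely based on names that appear in these commit messages.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping


def read_prefixed(prefix: str, environ: Mapping[str, str]) -> Dict[str, str]:
    """Collect variables starting with `prefix`; strip it and lowercase the rest."""
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


@dataclass(frozen=True)
class ServerSettings:
    # Hypothetical fields; the real settings model may differ.
    root_upload_folder: str
    debug: bool = False

    @classmethod
    def from_env(cls, environ: Mapping[str, str] = os.environ) -> "ServerSettings":
        raw = read_prefixed("SERVER_", environ)  # "SERVER_" is an assumed prefix
        return cls(
            root_upload_folder=raw["root_upload_folder"],
            debug=raw.get("debug", "false").lower() in {"1", "true", "yes"},
        )
```

With this layout, one file can hold `SERVER_...`, `POSTGRES_...` and similar groups without name clashes, and each settings class reads only its own group.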
🎨🌻 Add working configuration of celery task-queue with redis backend and rabbitMQ broker
📝 Add resources/references for misc. links used during debugging/troubleshooting
⬆️ Add flower. Move celery[redis] to pip install section of environment.yml
🎨 black formatting to serializers and change server.upload_folder root to server.root_upload_folder
🎨 black formatting to home.py
🎨 dbc.Select(...) are now populated with values from db tables
🐛🎨 Investigate whether serializer.get_uploaded_files_table() is functional (TODO)
🐳 Add commented out arg to create_engine(..., pre_pool_ping: Optional[bool]) and db settings
🐳 Add PRE_POOL_PING to .env
🔥 Remove (now deprecated) stores
🎨 Add serialization of a hash-refinement table when hash-binning file is uploaded (respective to hash-binning)
🎨 Remove --debug arg from parser. Move setting to .env (now read via automappa.settings.server.debug)
🔥 remove unused imports in index.py
🎨 Move samples stores from home.py to index.py
Add dbc.InputGroup(...) with selects for binning, markers and metagenome
🎨 Add dbc.Button(...) to save table-ids in 'selected-tables-store' to be read from mag_refinement.py
🎨🐛 Continue refactor of get_scatterplot_2d(...) to use other embedding methods
🎨 Add get_marker_symbols(...) to tasks.py and automappa.utils.markers (WIP: currently used to *expensively* generate symbols during 2D scatterplot creation)
🔥 Remove unused (commented out) argparse args
🔥🎨🐳 Update automappa-web service command to not use removed argparse args
Will still need to ensure this is working once celery tasks are implemented
🐛🔥 Fix sankey diagram subsetting bug where selections would not render
🎨 Change MAG-specific completeness/purity boxplot to barplot
🎨 Rename save_to_db(...) to file_to_db(...) in serializer
🎨 Add metric_barplot(...) func to automappa.utils.figures
- 🐳🎨 Add mem_limit: 4GB to automappa-web service
- 🎨🐛 Remove ability for user to navigate to other tabs when no data has been uploaded/selected
- 🎨 Disable 'Refine mags' button if data has not been selected
- 🎨 Add DataTable(...) to visualize the currently loaded datasets
- 🎨 Change server settings import so it is clear that the server settings are imported and not the server
- 🎨🥕 Silence numba and h5py loggers in mag_refinement.py (beginnings of celery task-queue tasks implementation)
- 🎨 Add logger to automappa.tasks
- 🐛 Fix logger emitted message during refinement data serialization to postgres db
…g methods

🎨🐛 Need to replace kmers.embed(...) method in tasks.py to include n_jobs=1 when autometa 2.0.4 release is out
🎨🥦 Add celeryconfig
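
For orientation, a Celery config module matching the RabbitMQ broker and Redis result backend mentioned in the earlier commit might look roughly like this. The setting names (`broker_url`, `result_backend`, `accept_content`) are standard Celery settings; the hostnames, ports, credentials and database index are assumptions (service names reused as hostnames, RabbitMQ's default guest account), not values taken from this PR.

```python
# celeryconfig.py (illustrative values only, not the file added in this PR)

# RabbitMQ as the message broker; "rabbitmq" is an assumed service hostname.
broker_url = "amqp://guest:guest@rabbitmq:5672//"

# Redis as the result backend; "redis" is an assumed service hostname.
result_backend = "redis://redis:6379/0"

# Restrict workers to JSON-serialized messages.
accept_content = ["json"]
```

A Celery app would typically load such a module with `app.config_from_object("celeryconfig")`.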
🎨🥦 Add working parallelized kmer freq. analysis pipeline for use in task-queue
🎨 Add use of pydantic.BaseModel in models.py of SampleTables, KmerTable, AnnotationTable and misc. others (WIP)
🎨 Add construction of SeqIO.SeqRecords when retrieving metagenome from table using new func, get_metagenome_seqrecords(...)
🎨 Refactor get_contig_marker_counts(...)
🎨 Refactor get_marker_symbols(...) in markers.py
🎨🔥 Remove hard-coded values in automappa.utils.figures.get_embedding_traces_df(...)
🎨 Parse selected_data_tables as SampleTables.parse_raw(selected_data_tables) for shared data
🎨 Add tasks button to home.py
🎨 Replace typehints for selected_data_tables to Json[SampleTables]
@evanroyrees self-assigned this May 16, 2022
@evanroyrees added the enhancement, aesthetic and performance labels May 16, 2022
🎨 Fetch data from postgres service
🎨🥦 Retrieve scatterplot-2d coordinates from celery task-queue results (selected from auto-generated dropdown)
🎨🌻 Add flower configuration env variable to settings.py
🎨 generate product of all combinations of kmer pipeline params (kmer_size, norm_method, embed_method) for 2d-coords retrieval
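
A small sketch of how such a parameter grid can be generated with `itertools.product`. The `KmerParams` tuple, the `kmer_param_grid` helper and the label format are hypothetical; the example values (kmer_size 5, `am_clr`, and the `bhsne`, `densmap` and `umap` embeddings) are the ones named in the later "Reduce kmer tasks" commit.

```python
from itertools import product
from typing import Iterable, List, NamedTuple


class KmerParams(NamedTuple):
    kmer_size: int
    norm_method: str
    embed_method: str


def kmer_param_grid(
    kmer_sizes: Iterable[int],
    norm_methods: Iterable[str],
    embed_methods: Iterable[str],
) -> List[KmerParams]:
    """Return every (kmer_size, norm_method, embed_method) combination."""
    return [
        KmerParams(*combo)
        for combo in product(kmer_sizes, norm_methods, embed_methods)
    ]


grid = kmer_param_grid([5], ["am_clr"], ["bhsne", "densmap", "umap"])
# Hypothetical label format for a dropdown of available 2D coordinates.
labels = [f"{p.kmer_size}mers-{p.norm_method}-{p.embed_method}" for p in grid]
```

The grid grows multiplicatively with each extra option, which may be part of why the task list was later cut down to a single k-mer size and normalization method.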
🎨🐎🐛 Retrieve mag summary data from SampleTables properties instead of using get_table(...)
🔥🐛 Remove default value of 'cluster' for mag_summary_cluster_col_dropdown
🎨🔥 Remove unnecessary nested lists of Input(...) in mag_summary callbacks
🎨 Add long_callback_manager to app (WIP: currently not being used)
🎨 Add handling of kmer param titles and metagenome annotation titles to format_axis_title(...) in figures.py
🎨🐳🐰 Change rabbitmq image from latest to 3.10.1-management-alpine (w/port) to use amqp management UI
📝 Move misc. markdown docs to docs directory
🐳 Ignore .vscode for docker context
🐳 Install mamba and prune prior to env update
💚🐎 Add dash-extensions to enable caching
🔥 Reduce kmer tasks to am_clr norm method and kmer_size of 5 (bhsne, densmap and umap embeddings)
Turn off debug mode by default
🔥 Change clone of `home-tab` branch to `develop` branch
@evanroyrees merged commit c245bc2 into develop Aug 3, 2022
@evanroyrees deleted the home-tab branch August 3, 2022 15:14
Labels
aesthetic (Improvements or additions to aesthetics), enhancement (New feature or request), performance (Performance improvements to speed up user experience)