@bistline bistline commented Oct 5, 2022

BACKGROUND & CHANGES

This fixes a bug where reading a matrix slice that lands exactly on a line ending results in an endless while loop, causing processing to hang. It also addresses a related issue where empty gene files named .json.gz were being created. Lastly, it removes the .gz suffix from rendered files to maintain parity with how SCP stores gzipped data in GCS buckets with Content-Encoding: gzip headers.
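To illustrate the class of bug described above, here is a minimal sketch of reading a byte-range slice of a line-oriented file so that boundaries falling exactly on a line ending neither hang nor duplicate lines. This is not the code from ingest/expression_writer.py; the function name `read_slice` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import List


def read_slice(path: str, start: int, end: int) -> List[str]:
    """Return the lines whose first byte lies in [start, end).

    Hypothetical helper, not the pipeline's actual implementation.
    """
    lines = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read to the next newline. If the slice
            # starts mid-line, this discards the partial line (owned by the
            # previous slice). If the slice starts exactly after a newline,
            # this consumes only that newline and leaves us at `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                # End of file: stop rather than looping forever.
                break
            lines.append(line.decode("utf-8").rstrip("\n"))
    return lines
```

The loop advances only through `readline()`, and the explicit end-of-file check guarantees termination even when `end` is larger than the file. Adjacent slices that share a boundary assign each line to exactly one slice.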

MANUAL TESTING

  1. Pull the branch, then either initialize the scp-ingest-pipeline environment via source env/bin/activate or build the Docker container with docker build -t gcr.io/broad-singlecellportal-staging/ingest-pipeline:test-candidate .
  2. Run export BYPASS_MONGO_WRITES=yes in the terminal (or Docker container) once the environment is set up
  3. Run the following command:
python ingest_pipeline.py --study-id 5d276a50421aa9117c982845 --study-file-id 5dd5ae25421aa910a723a337 \
                             render_expression_arrays --matrix-file-path ../tests/data/expression_writer/slice_testing/expression_matrix.tsv \
                             --matrix-file-type dense \
                             --cluster-file ../tests/data/expression_writer/slice_testing/cluster.tsv \
                             --cluster-name 'Slice Example' --render-expression-arrays
  4. Navigate to Slice_Example on the command line and confirm that the following files exist: Adcy5.json, Agpat2.json, Agtr1.json, Aifm1.json, Apex1.json, Apoc3.json, Apoe.json, and that there is no empty file called '.json'
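The final check can also be scripted. The following is a rough sketch, assuming the rendered files are in a directory such as Slice_Example; the helper `check_rendered_files` is hypothetical and not part of the pipeline, and the gene list is copied from the step above.

```python
from pathlib import Path
from typing import List, Sequence, Tuple

# Gene names taken from the manual testing step above.
EXPECTED_GENES = ["Adcy5", "Agpat2", "Agtr1", "Aifm1", "Apex1", "Apoc3", "Apoe"]


def check_rendered_files(
    directory: str, genes: Sequence[str] = EXPECTED_GENES
) -> Tuple[List[str], bool]:
    """Return (missing file names, whether a nameless '.json' file exists).

    Hypothetical helper for manual verification only.
    """
    out_dir = Path(directory)
    missing = [
        f"{gene}.json" for gene in genes if not (out_dir / f"{gene}.json").is_file()
    ]
    has_nameless = (out_dir / ".json").exists()
    return missing, has_nameless


if __name__ == "__main__":
    missing, has_nameless = check_rendered_files("Slice_Example")
    print("missing files:", missing)
    print("nameless .json present:", has_nameless)
```

A passing run reports no missing files and no nameless .json file.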

@bistline bistline requested review from ehanna4, eweitz and jlchang October 5, 2022 22:05

codecov bot commented Oct 5, 2022

Codecov Report

Base: 66.75% // Head: 66.75% // Decreases project coverage by -0.00% ⚠️

Coverage data is based on head (c87e651) compared to base (545c4a6).
Patch coverage: 66.66% of modified lines in pull request are covered.

Additional details and impacted files
@@               Coverage Diff               @@
##           development     #275      +/-   ##
===============================================
- Coverage        66.75%   66.75%   -0.01%     
===============================================
  Files               29       29              
  Lines             3974     3979       +5     
===============================================
+ Hits              2653     2656       +3     
- Misses            1321     1323       +2     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ingest/ingest_files.py | 85.50% <0.00%> (-0.80%) | ⬇️ |
| ingest/expression_writer.py | 90.56% <70.00%> (+0.05%) | ⬆️ |
| ingest/writer_functions.py | 97.50% <100.00%> (+0.03%) | ⬆️ |

@eweitz eweitz left a comment

Code looks good! Nice refinements.

@ehanna4 ehanna4 left a comment

Functional Review - Works as advertised 👍

@bistline bistline merged commit 3249e71 into development Oct 6, 2022
@bistline bistline deleted the jb-exp-writer-integration branch October 6, 2022 16:40