Merged
2 changes: 1 addition & 1 deletion cloudbuild.yaml
@@ -14,7 +14,7 @@

# Builds docker image for gcp-variant-transforms:
# Run using:
-# $ gcloud container builds submit --config cloudbuild.yaml --timeout 1h .
+# $ gcloud builds submit --config cloudbuild.yaml --timeout 1h .
substitutions:
  _CUSTOM_TAG_NAME: 'latest'
steps:
3 changes: 2 additions & 1 deletion deploy_and_run_tests.sh
@@ -207,8 +207,9 @@ if [[ -z "${skip_build}" ]]; then
# TODO(bashir2): This will pick and include all directories in the image,
# including local build and library dirs that do not need to be included.
# Update this to include only the required files/directories.
-gcloud container builds submit --config "${build_file}" \
+gcloud builds submit --config "${build_file}" \
Contributor

To clarify: is this the new way of using Cloud Build?

Contributor Author

Yes. When I ran deploy_and_run_tests.sh, I got the following error message:

ERROR: (gcloud.container.builds.submit) This command has been replaced with `gcloud builds`.

Contributor

I see. Thanks for fixing this! Could you please update the comment in cloudbuild.yaml as well?

Contributor Author

Done.

--project "${project}" \
--timeout '30m' \
--substitutions _CUSTOM_TAG_NAME="${image_tag}" .
fi
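
The renamed invocation in this hunk can also be sketched as an argument list in Python. This is only an illustration: the helper name `build_submit_command` and the `my-project` value are hypothetical, and the flags are limited to the ones that appear in the script above.

```python
def build_submit_command(build_file, project, image_tag, timeout='30m'):
  """Returns the argv list for the `gcloud builds submit` call in the script.

  Hypothetical helper for illustration; it only assembles the arguments and
  does not run gcloud.
  """
  return [
      'gcloud', 'builds', 'submit',  # replaces `gcloud container builds submit`
      '--config', build_file,
      '--project', project,
      '--timeout', timeout,
      '--substitutions', '_CUSTOM_TAG_NAME={}'.format(image_tag),
      '.',
  ]


if __name__ == '__main__':
  # 'my-project' is a placeholder project ID.
  print(' '.join(build_submit_command('cloudbuild.yaml', 'my-project', 'latest')))
```

Keeping the subcommand in one place like this would make a future rename a one-line change.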

7 changes: 5 additions & 2 deletions gcp_variant_transforms/beam_io/vcfio_test.py
@@ -101,11 +101,14 @@ def _get_sample_variant_1(is_for_nucleus=False):
vcfio.VariantCall(name='Sample2', genotype=[1, 0], info={'GQ': 20}))
else:
# 0.1 -> 0.25 float precision loss due to binary floating point conversion.
-vcf_line = ('20 1234 rs123;rs2 C A,T 50 '
+# rs123;rs2 -> rs123: it seems Nucleus does not parse IDs correctly.
+# quality=50 -> 50.0: Nucleus converts quality values to float.
+# TODO(samanvp): convert all quality values to float.
Contributor

@allieychen FYI
I actually noticed that we have the same problem in the BQ -> VCF pipeline. VCF quality values are typically integers, but the spec defines the field as a float. So, while technically correct, our output VCF files have #.0 for the quality field.
I think a better solution is to add some logic to our parser to output these as ints if the value is actually an integer. An easy way to do this is `return quality if int(quality) != quality else int(quality)`, with some special casing when quality is None. Let's do this in another PR though.
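
A minimal sketch of the suggestion above, assuming a hypothetical helper named `format_quality` (it is not part of the codebase). It uses `float.is_integer()` instead of the `int(quality) != quality` comparison so that NaN and infinite values pass through unchanged rather than raising:

```python
def format_quality(quality):
  """Returns whole-number float qualities as ints; leaves other values as-is.

  Hypothetical helper illustrating the review suggestion.
  """
  if quality is None:
    # Missing quality stays missing.
    return None
  if isinstance(quality, float) and quality.is_integer():
    # e.g. 50.0 -> 50, so the output VCF shows "50" rather than "50.0".
    return int(quality)
  # Ints, fractional floats, NaN and infinities are returned unchanged.
  return quality
```

With this in place, a quality of `50.0` would be written as `50`, while `50.5` would be written unchanged.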

Contributor

Got it! Thanks!

+vcf_line = ('20 1234 rs123 C A,T 50 '
'PASS AF=0.5,0.25;NS=1 GT:GQ 0/0:48 1/0:20\n')
variant = vcfio.Variant(
reference_name='20', start=1233, end=1234, reference_bases='C',
-alternate_bases=['A', 'T'], names=['rs123', 'rs2'], quality=50,
+alternate_bases=['A', 'T'], names=['rs123'], quality=50.0,
filters=['PASS'], info={'AF': [0.5, 0.25], 'NS': 1})
variant.calls.append(
vcfio.VariantCall(name='Sample1', genotype=[0, 0], info={'GQ': 48}))
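
The `0.1 -> 0.25` comment in this hunk refers to binary floating point precision. The sketch below illustrates the effect under the assumption that the conversion involved is to a 32-bit float, which the source does not state; `float32_round_trip` is a hypothetical helper built on the standard `struct` module.

```python
import struct


def float32_round_trip(value):
  """Packs value as an IEEE 754 single-precision float and unpacks it again."""
  return struct.unpack('<f', struct.pack('<f', value))[0]


if __name__ == '__main__':
  # 0.25 and 0.5 are powers of two, so they survive the round trip exactly.
  print(float32_round_trip(0.25) == 0.25)
  print(float32_round_trip(0.5) == 0.5)
  # 0.1 has no finite binary expansion, so the 32-bit value differs from the
  # 64-bit Python float it started as.
  print(float32_round_trip(0.1) == 0.1)
```

Using values such as 0.25 in the test data keeps the expected `Variant` equal to the parsed one regardless of this conversion.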
22 changes: 3 additions & 19 deletions setup.py
@@ -14,9 +14,7 @@

"""Beam pipelines for processing variants based on VCF files."""

-import os
import setuptools
-from setuptools.command.build_py import build_py

REQUIRED_PACKAGES = [
'cython>=0.28.1',
@@ -28,6 +26,9 @@
'google-api-python-client>=1.6',
'intervaltree>=2.1.0,<2.2.0',
'pyvcf<0.7.0',
+'google-nucleus==0.2.0',
+# Nucleus needs an up-to-date protocol buffer compiler (protoc).
+'protobuf>=3.6.1',
'mmh3<2.6',
# Need to explicitly install v<=1.2.0. apache-beam requires
# google-cloud-pubsub 0.26.0, which relies on google-cloud-core<0.26dev,
@@ -45,20 +46,6 @@
'nose>=1.0',
]

-
-class BuildPyCommand(build_py):
-  """Custom build command for installing libraries outside of PyPi."""
-
-  _NUCLEUS_WHEEL_PATH = (
-      'https://storage.googleapis.com/gcp-variant-transforms-setupfiles/'
-      'nucleus/Nucleus-0.1.0-py2-none-any.whl')
-
-  def run(self):
-    # Temporary workaround for installing Nucleus until it's available via PyPi.
-    os.system('pip install {}'.format(BuildPyCommand._NUCLEUS_WHEEL_PATH))
-    build_py.run(self)
-
-
setuptools.setup(
name='gcp_variant_transforms',
version='0.5.1',
@@ -90,7 +77,4 @@ def run(self):
package_data={
'gcp_variant_transforms': ['gcp_variant_transforms/testing/testdata/*']
},
-cmdclass={
-'build_py': BuildPyCommand,
-},
)