Create an ADAM Python API #538

Closed
laserson opened this Issue Jan 6, 2015 · 6 comments

Contributor

laserson commented Jan 6, 2015

No description provided.

@laserson laserson added the enhancement label Jan 6, 2015

@laserson laserson self-assigned this Jan 6, 2015

milos-popovic commented Apr 22, 2015

@laserson Hi, may I ask how this project is progressing? I've looked at https://github.com/laserson/adam/tree/ADAM-538-python, but it has only one commit and has been inactive for some time. Has development moved elsewhere, or is it currently on hold?

PS: I'm quite keen to get ADAM working in Python, so I'd be interested in contributing if I manage to get it running. I'm still familiarizing myself with the ADAM and PySpark codebases, though, so it will take some time until I get there.

Member

fnothaft commented Apr 23, 2015

@milos-popovic the easiest approach might be to use the ADAM core to ETL data into Parquet, and then to load the data using the Spark DataFrames API, which is supported in Scala, Java, Python, and R.
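
For illustration, the Parquet route could look roughly like the PySpark sketch below. The column names (`contigName`, `start`, `end`) and the helper names `region_predicate` and `load_region` are assumptions made for this example; they are not confirmed anywhere in this thread, and the actual ADAM schema may differ between releases. The sketch also assumes a Spark version that provides `sqlContext.read.parquet`.

```python
def region_predicate(contig, start, end):
    """Build a Spark SQL filter for records overlapping the half-open range [start, end).

    The column names are assumptions about the ADAM alignment schema.
    """
    # Backticks quote the column names so that `end` is not parsed as a SQL keyword.
    return "`contigName` = '{0}' AND `start` < {2} AND `end` > {1}".format(
        contig, int(start), int(end)
    )


def load_region(sql_context, path, contig, start, end):
    """Load ADAM-written Parquet as a DataFrame and keep records overlapping the region."""
    df = sql_context.read.parquet(path)  # DataFrameReader.parquet
    # DataFrame.filter accepts a SQL expression string.
    return df.filter(region_predicate(contig, start, end))
```

Inside a `pyspark` shell this could be called as `load_region(sqlContext, "/tmp/small.adam", "chr1", 1000, 2000)`, where the path and contig name are placeholders.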

milos-popovic commented May 11, 2015

@fnothaft Thanks for the quick reply, and sorry for my late one. I have started using the DataFrames API to query the ADAM Parquet files. As for PyADAM, is it still planned for development?

Contributor

laserson commented May 12, 2015

This is definitely still planned. As you mentioned, using the DataFrames API is the best workaround at the moment, except that you don't get access to the region-join implementations in ADAM. Exposing those will definitely be the primary focus of getting the Python API up and running.

milos-popovic commented May 14, 2015

@laserson That's good to know, though I don't have a pressing need for region joins at the moment, and the DataFrame API is quite sufficient. I'll still keep an eye on the project.

nyetsche commented Jul 8, 2016

This is far from complete, but thanks to py4j already being included with PySpark, it's possible to interface with the ADAM code. Here's how:

# Get a list of the ADAM jars, it's okay to ignore the testing ones. Convert that list to a CLASSPATH style with : as the separator
$ export CLASSPATH=$(find $ADAM_HOME -name '*SNAPSHOT.jar' | tr '\n' ':')
# start pyspark
$ pyspark
[...]
SparkContext available as sc, HiveContext available as sqlContext.
>>> 
# Grab a copy of the JVM from the sparkcontext
>>> j = sc._jvm
# Grab a copy of the ADAM context through the Java API. NB: pass sc._jsc in as the argument, not sc which is a python object
>>> q = j.org.bdgenomics.adam.apis.java.JavaADAMContext(sc._jsc)
# Run ADAMContext methods
>>> a = q.loadAlignments("/tmp/small.adam")

Unfortunately I haven't found a good way to get Python to introspect the Java objects, so you'd need to cross-reference with the Java/Scala definitions.
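
The CLASSPATH step above can also be scripted. The sketch below is one possible Python equivalent of the `find ... | tr` pipeline; the function name `build_classpath` is hypothetical, and unlike `tr` it does not leave a trailing separator at the end of the string.

```python
import fnmatch
import os


def build_classpath(root, pattern="*SNAPSHOT.jar"):
    """Walk root and join every file name matching pattern into a CLASSPATH string."""
    jars = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            jars.append(os.path.join(dirpath, name))
    # Sort for a deterministic order, then join with the platform separator (":" on Unix).
    return os.pathsep.join(sorted(jars))
```

The returned string can then be exported as CLASSPATH before starting `pyspark`, for example with `root` set to the value of `$ADAM_HOME`.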

@fnothaft fnothaft referenced this issue Feb 10, 2017

Closed

WIP Python API #1387

3 of 5 tasks complete

fnothaft added a commit to fnothaft/adam that referenced this issue Feb 15, 2017

[ADAM-538] Add support for an adam-python API.
Resolves bigdatagenomics#538. Adds support for Python APIs that use the ADAM Java API to make
the ADAMContext and RDD functions accessible natively through python.

fnothaft added 14 more commits titled "[ADAM-538] Add support for an adam-python API." to fnothaft/adam that referenced this issue between Feb 16 and Feb 17, 2017

@fnothaft fnothaft added this to the 0.23.0 milestone Mar 3, 2017

@heuermh heuermh added this to Triage in Release 0.23.0 Mar 8, 2017

fnothaft added 19 more commits titled "[ADAM-538] Add support for an adam-python API." to fnothaft/adam that referenced this issue between Apr 10 and Jul 10, 2017

@heuermh heuermh closed this in c34a440 Jul 11, 2017

@heuermh heuermh moved this from Triage to Completed in Release 0.23.0 Jan 4, 2018
