Create an ADAM Python API #538

Closed
laserson opened this issue Jan 6, 2015 · 6 comments

@laserson
Contributor

@laserson laserson commented Jan 6, 2015

No description provided.

@laserson laserson added the enhancement label Jan 6, 2015
@laserson laserson self-assigned this Jan 6, 2015
@milos-popovic

@milos-popovic milos-popovic commented Apr 22, 2015

@laserson Hi, may I ask how this project is progressing? I've looked at https://github.com/laserson/adam/tree/ADAM-538-python, but it only has one commit and has been inactive for some time now. I was wondering whether development has moved elsewhere or is currently on hold.

PS: I'm quite keen to get ADAM working in Python, so I'd be interested in contributing if I manage to get it running. That said, I'm still familiarizing myself with the ADAM and PySpark codebases, so it'll take some time until I get there.

@fnothaft
Member

@fnothaft fnothaft commented Apr 23, 2015

@milos-popovic The easiest option might be to use the ADAM core to ETL data into Parquet, and then load the data using the Spark DataFrames API, which is supported in Scala/Java/Python/R.
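For readers following this workaround, here is a minimal sketch of what loading ADAM-written Parquet from Python could look like. It is not part of ADAM. The helper names `overlap_predicate` and `load_region` are hypothetical, and the column names `contigName`, `start`, and `end` are assumptions about the Parquet schema that should be checked against `printSchema()` on your own files. It also assumes flat (non-nested) columns and a Spark version whose session or SQL context exposes `.read.parquet`.

```python
def overlap_predicate(contig, start, end,
                      contig_col="contigName", start_col="start", end_col="end"):
    """Build a Spark SQL predicate for records overlapping [start, end) on `contig`.

    The default column names are assumptions, not a documented ADAM schema.
    """
    if start >= end:
        raise ValueError("start must be less than end")
    if "'" in contig:
        raise ValueError("contig name must not contain a single quote")
    # Backticks quote the column names so that words such as `end` are
    # always treated as identifiers.
    return "`{0}` = '{1}' AND `{2}` < {3} AND `{4}` > {5}".format(
        contig_col, contig, start_col, int(end), end_col, int(start))


def load_region(spark, path, contig, start, end):
    """Read a Parquet dataset and keep only the records overlapping the region.

    `spark` is expected to be a SparkSession (or a SQLContext that has `.read`).
    """
    records = spark.read.parquet(path)
    return records.filter(overlap_predicate(contig, start, end))
```

In a `pyspark` shell this would be called as something like `load_region(sqlContext, "/tmp/small.adam", "chr1", 100, 200)`, where the path and region are placeholders.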

@milos-popovic

@milos-popovic milos-popovic commented May 11, 2015

@fnothaft Thanks for the quick reply, and sorry for the late one on my side. I have started using the DataFrames API to query the ADAM Parquet files. As for PyADAM, is it still planned for development?

@laserson
Contributor Author

@laserson laserson commented May 12, 2015

This is definitely still planned. As you mentioned, using the DataFrames API is the best workaround at the moment, except that you don't get access to the region-join implementations in ADAM. Exposing those will definitely be the primary focus of getting the Python API up and running.
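To illustrate what a region join computes, here is a small pure-Python sketch. It is only meant to show the semantics (pair up records whose genomic intervals overlap) and is not how ADAM implements it. The `Region` type, the function names, and the half-open `[start, end)` convention are assumptions made for this example, and the nested loop is quadratic, so it is unsuitable for real data.

```python
from collections import namedtuple

# Hypothetical record type for this example: a half-open interval on a contig.
Region = namedtuple("Region", ["contig", "start", "end"])


def overlaps(a, b):
    """True when two half-open regions share at least one position."""
    return a.contig == b.contig and a.start < b.end and b.start < a.end


def region_join(left, right):
    """Naive O(len(left) * len(right)) join of all overlapping pairs."""
    return [(l, r) for l in left for r in right if overlaps(l, r)]


reads = [Region("chr1", 0, 10), Region("chr1", 20, 30), Region("chr2", 0, 10)]
features = [Region("chr1", 5, 25)]
pairs = region_join(reads, features)
```

Here `pairs` contains the two `chr1` reads paired with the single feature. The `chr2` read is dropped because it lies on a different contig.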

@milos-popovic

@milos-popovic milos-popovic commented May 14, 2015

@laserson That's good to know, though I don't have much need for region joins at the moment and the DataFrame API is quite sufficient. I'll still keep an eye on the project.

@nyetsche

@nyetsche nyetsche commented Jul 8, 2016

This is far from complete, but because py4j is already included with pyspark, it's possible to interface with the ADAM code. Here's how:

# List the ADAM jars (it's okay to ignore the testing ones) and join them into a CLASSPATH-style string, using : as the separator
$ export CLASSPATH=$(find $ADAM_HOME -name '*SNAPSHOT.jar' | tr '\n' ':')
# start pyspark
$ pyspark
[...]
SparkContext available as sc, HiveContext available as sqlContext.
>>> 
# Grab a handle to the JVM from the SparkContext
>>> j = sc._jvm
# Create an ADAM context through the Java API. NB: pass sc._jsc as the argument, not sc, which is a Python object
>>> q = j.org.bdgenomics.adam.apis.java.JavaADAMContext(sc._jsc)
# Run ADAMContext methods
>>> a = q.loadAlignments("/tmp/small.adam")

Unfortunately I haven't found a good way to get Python to introspect the Java objects, so you'd need to cross-reference with the Java/Scala definitions.
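The steps above can be wrapped in a couple of small Python helpers. This is a sketch under assumptions, not part of ADAM. The function names `adam_classpath` and `java_adam_context` are hypothetical. The class path `org.bdgenomics.adam.apis.java.JavaADAMContext` and its `sc._jsc` argument are taken from the session above and may differ between ADAM versions.

```python
import fnmatch
import os


def adam_classpath(adam_home, pattern="*SNAPSHOT.jar"):
    """Collect jars under `adam_home` and join them CLASSPATH-style.

    Mirrors the `find ... | tr` one-liner, but uses the platform's path
    separator and does not leave a trailing separator.
    """
    jars = []
    for root, _dirs, files in os.walk(adam_home):
        for name in fnmatch.filter(files, pattern):
            jars.append(os.path.join(root, name))
    return os.pathsep.join(sorted(jars))


def java_adam_context(sc):
    """Create the JVM-side ADAM context from a PySpark SparkContext.

    `sc._jvm` is the py4j view of the JVM. `sc._jsc` is the underlying
    JavaSparkContext, which is passed instead of the Python `sc` object.
    """
    jvm = sc._jvm
    return jvm.org.bdgenomics.adam.apis.java.JavaADAMContext(sc._jsc)
```

The classpath has to be in the environment before the JVM starts, so `adam_classpath` would be used in a launcher script rather than from inside an already running `pyspark` shell. Inside the shell, `java_adam_context(sc).loadAlignments("/tmp/small.adam")` corresponds to the last two commands of the session.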

@fnothaft fnothaft mentioned this issue Feb 10, 2017
3 of 5 tasks complete
fnothaft added a commit to fnothaft/adam that referenced this issue Feb 15, 2017
Resolves bigdatagenomics#538. Adds support for Python APIs that use the ADAM Java API to make
the ADAMContext and RDD functions accessible natively through python.
@fnothaft fnothaft added this to the 0.23.0 milestone Mar 3, 2017
@heuermh heuermh added this to Triage in Release 0.23.0 Mar 8, 2017
@heuermh heuermh closed this in c34a440 Jul 11, 2017
@heuermh heuermh moved this from Triage to Completed in Release 0.23.0 Jan 4, 2018