Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GFF3 support in filter by attribute #3076

Merged
merged 3 commits into from Jun 26, 2017

Conversation

Projects
None yet
6 participants
@peterjc
Copy link
Contributor

commented Oct 21, 2016

Pull request to close #3067 and #3068, including simple test filtering a GFF3 file.

@peterjc

This comment has been minimized.

Copy link
Contributor Author

commented Oct 21, 2016

It looks like the main Galaxy repository does not include any TravisCI testing?

[Update - It was just being slow, tests started now]

This was tested with planemo as per galaxyproject/planemo#594 because some of the test files are now in https://github.com/galaxyproject/galaxy-test-data - see #3065.

@galaxybot galaxybot added the triage label Oct 21, 2016

@galaxybot galaxybot added this to the 16.10 milestone Oct 21, 2016

@jmchilton

This comment has been minimized.

Copy link
Member

commented Oct 31, 2016

Looks awesome to me - I'll give say @blankenberg or @bgruening a couple more days to review and then merge if neither of them weigh in on this.

if pair == '':
# Split on first equals (GFF3) or space (legacy)
name_value_pair = name_value_pair.strip()
i = name_value_pair.replace(" ", "=").find("=")

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Oct 31, 2016

Member

According to http://gmod.org/wiki/GFF3, space are allowed in the attributes column, so this would not work on some GFF3 files.

This comment has been minimized.

Copy link
@peterjc

peterjc Nov 3, 2016

Author Contributor

Sigh. You're probably right, but I don't see an easy fix for this. I would prefer a top level switch based on the presence of the explicit GFF3 header between two parsers - but the whole "eval" implementation would make this a much larger change.

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Nov 3, 2016

Member

You can check if the Galaxy datatype is 'gff3' (using $input.ext).

This comment has been minimized.

Copy link
@jmchilton

jmchilton Nov 16, 2016

Member

Marking as WIP until this is resolved - will probably be pushed to 17.01.

@martenson

This comment has been minimized.

Copy link
Member

commented Nov 16, 2016

@jmchilton are you still willing to merge this?

@bgruening

This comment has been minimized.

Copy link
Member

commented Jan 8, 2017

How do we want to proceed here?

@peterjc

This comment has been minimized.

Copy link
Contributor Author

commented Jan 11, 2017

I presume the merge conflict is from fixing the security vulnerability found by David Wyde. I can see why eval was used here, but it also makes better file format support hard as per this fix.

@martenson martenson modified the milestones: 17.05, 17.01 Jan 12, 2017

@martenson martenson removed this from the 17.05 milestone Apr 26, 2017

@bgruening

This comment has been minimized.

Copy link
Member

commented Jun 24, 2017

ping @peterjc

@peterjc

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2017

The separator issue noted by @nsoranzo makes a robust parser doing both GFF and GTF (update: I should have said legacy GFF, but that kind of includes GTF) at the same time impossible (as written my change would break on some examples). This would need to detect the file format from the headers and/or the Galaxy datatype, and process the final column accordingly. That's all fine in theory but the eval implementation makes it cumbersome.

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 26, 2017

@peterjc Agreed, that eval should be rewritten, we had already a couple of security-related issues with it.

peterjc added some commits Oct 21, 2016

Ignore blank lines and comments in GFF files
Previously would wrongly issue warnings about valid # lines.
Closes #3067
Cope with GFF3 as well as legacy GFF.
In GFF3 have key=value rather than space separated
as in legacy GFF and GTF format. Closes #3068.

@peterjc peterjc force-pushed the peterjc:gff3_filter_by_attribute branch from 6fc99cf to 005f3d9 Jun 26, 2017

@peterjc

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2017

I have rebased this old work of mine to resolve the merge conflict (and update the test results), which is arguably still an improvement. I wonder if rather than attempting to handle multiple versions of GFF the tool's input could be restricted to only old (non GFF3) legacy GFF style files?

However, I agree this tool needs a full rewrite to avoid using eval. I am not volunteering for that.

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 26, 2017

Thanks @peterjc, solid improvements! If we find more issues, we can fix them later on.

@bgruening

This comment has been minimized.

Copy link
Member

commented Jun 26, 2017

Can only second @nsoranzo! Looks good! Thanks!

@bgruening bgruening merged commit 46dbf8d into galaxyproject:dev Jun 26, 2017

5 checks passed

api test Build finished. 279 tests run, 0 skipped, 0 failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
framework test Build finished. 150 tests run, 0 skipped, 0 failed.
Details
integration test Build finished. 34 tests run, 0 skipped, 0 failed.
Details
toolshed test Build finished. 579 tests run, 0 skipped, 0 failed.
Details
@peterjc

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2017

Thanks all - let's hope this helps, and someone will take on the re-write later on.

@peterjc peterjc deleted the peterjc:gff3_filter_by_attribute branch Jun 26, 2017

@nsoranzo nsoranzo added this to the 17.09 milestone Jun 26, 2017

@nsoranzo nsoranzo removed the status/WIP label Jun 26, 2017

@galaxyproject galaxyproject deleted a comment from galaxybot Jun 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.