Clearer error message when using min-overlap with an anchored, non-linked adapter (^ADAPTER;o=12) #592

Closed
DriesSchaumont opened this issue Feb 4, 2022 · 9 comments


@DriesSchaumont

Version information:

  • Cutadapt 3.5
  • Python 3.8
  • Installed using pip

I am not providing an example input read or the command-line parameters used, because I am not reporting a bug. This is just a request for an improved error message.

Recently, I bumped the cutadapt version from 3.3 to 3.5, as I use it as a dependency for a project.
We are now seeing the error reported here, introduced by #544

raise ValueError("Setting min_overlap/o for anchored adapters is not possible")

It took us a while to figure out why we could no longer use the minimum overlap like this (I had to actually look at the code).
I propose to improve this error message, because the error message can be easily misinterpreted: that min-overlap is not possible at all with non-linked adapters, while it is still possible to use a global min-overlap option (-O, capital O) (if I am not mistaken?). Also, it is only not possible when using a non-linked adapter.

Perhaps something like: Setting min_overlap/o for anchored adapters is not possible when using a non-linked adapter, use --overlap, -O instead? I will try to open a PR.

@marcelm
Owner

marcelm commented Feb 4, 2022

Which type of adapter did you specify?

I can cause the message to be printed by running cutadapt -g '^ACGT;min_overlap=5' -. Is that what you meant?

Edit: It’s right in the title, didn’t see that.

@marcelm
Owner

marcelm commented Feb 4, 2022

can be easily misinterpreted: that min-overlap is not possible at all with non-linked adapters, while it is still possible to use a global min-overlap option (-O, capital O) (if I am not mistaken?).

Hm, I’m confused.

The point is that the minimum overlap only makes sense for adapter types that allow partial occurrences. Anchored adapters must always occur in full, so setting a minimum overlap is not useful.

So writing -g ^ADAPTER;o=12 doesn’t make sense and instead of silently ignoring the o=12, Cutadapt complains.

You can still use a global -O 12, but it will be ignored for the anchored adapters.
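To illustrate the rule described above, here is a minimal sketch of a parser that rejects a per-adapter minimum overlap on an anchored 5' adapter instead of silently ignoring it. This is not Cutadapt's actual code: the function name `parse_adapter_spec`, the returned dictionary layout, and the wording of the error are assumptions made for illustration, and linked adapters (`...`) are not handled.

```python
def parse_adapter_spec(spec: str) -> dict:
    """Toy parser for 'SEQUENCE;key=value;...' (hypothetical, not Cutadapt's API)."""
    sequence, *params = spec.split(";")
    # A leading '^' marks an anchored 5' adapter in this sketch.
    anchored = sequence.startswith("^")
    options = {}
    for param in params:
        key, _, value = param.partition("=")
        key = key.strip()
        if key in ("o", "min_overlap"):
            if anchored:
                # Anchored adapters must occur in full, so a minimum
                # overlap has no meaning; complain rather than ignore it.
                raise ValueError(
                    "min_overlap/o has no effect on anchored adapters "
                    "because they must always match in full"
                )
            options["min_overlap"] = int(value)
        else:
            options[key] = value.strip()
    return {
        "sequence": sequence.lstrip("^"),
        "anchored": anchored,
        "options": options,
    }
```

In this sketch, `parse_adapter_spec("ACGT;o=3")` is accepted, while `parse_adapter_spec("^ACGT;min_overlap=5")` raises a `ValueError`.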

@marcelm
Owner

marcelm commented Feb 4, 2022

There’s no difference whether an anchored adapter is specified by itself or as a component of a linked adapter. So you’ll get the same message when you write cutadapt -a '^ACGT;o=3...TGA' -.

There’s definitely something that needs to be fixed in the message, but I’m trying to find out what the actual confusion is.

@DriesSchaumont
Author

It seems that I jumped the gun a bit too quickly by assuming that this was about linked adapters. I now understand that it is not, my apologies.

The missing piece of the puzzle for me was that an anchored adapter always needs to be present in full.
It seems that I misinterpreted the examples in the doc here:

https://cutadapt.readthedocs.io/en/stable/guide.html#anchored-5-adapters

I thought that

ADAPTER
ADAPT
ADA

actually meant

ADAPTERrestofread
ADAPTrestofread
ADArestofread

But of course, I missed the most important line:
The read will simply be empty after trimming.

Knowing this, I think the error message could just be improved to something like:
Setting min_overlap/o for anchored adapters is not possible. It is always tried to match them in full.

@marcelm
Owner

marcelm commented Feb 4, 2022

Thanks, I’ll update both the documentation and the error message.

@DriesSchaumont
Author

Thanks, I’ll update both the documentation and the error message.

Thanks! Can I also suggest adding this to the changelog?

@marcelm
Owner

marcelm commented Feb 4, 2022

Please see commit cc82a8a for the improved error message and the changelog entry.

The updated section in the docs is online at https://cutadapt.readthedocs.io/en/latest/guide.html#anchored-5-adapters.

I noticed that the paragraph in the documentation that you quoted is actually wrong: If a read is shorter than the adapter and starts with a part of the anchored 5' adapter, this is not found. It’s an edge case that no one has complained about, so I’ve now adjusted the documentation instead of the behavior.
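As a rough illustration of the behavior described above, the following toy function trims an anchored 5' adapter only when the adapter occurs in full at the start of the read. It is a simplified sketch, not Cutadapt's implementation: the name `trim_anchored_5prime` is made up, and it assumes exact matching with no error tolerance.

```python
def trim_anchored_5prime(read: str, adapter: str) -> str:
    """Remove `adapter` only if the read starts with the complete adapter.

    Hypothetical helper using exact matching; partial occurrences
    are never trimmed.
    """
    if read.startswith(adapter):
        return read[len(adapter):]
    # Partial prefix (or a read shorter than the adapter): not found.
    return read


print(trim_anchored_5prime("ADAPTERrestofread", "ADAPTER"))  # full adapter: trimmed
print(trim_anchored_5prime("ADAPTrestofread", "ADAPTER"))    # partial adapter: unchanged
print(trim_anchored_5prime("ADAPT", "ADAPTER"))              # read shorter than adapter: unchanged
```

Under these assumptions, only the first read is trimmed. The third call corresponds to the edge case mentioned above, where a read that is shorter than the adapter and starts with part of it is left as-is.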

I have also updated a couple of other places to clarify that specifying a minimum overlap for anchored adapters is not useful and is either ignored or leads to an error.

@DriesSchaumont
Author

That seems clear to me, thank you so much for the quick response!

@marcelm
Owner

marcelm commented Feb 4, 2022

Thanks!

2 participants