Skip to content

Conversation

@jamesmcm
Copy link

Clarify invalid bases problem - see exercism/python#916

Clarify invalid bases problem
@rpottsoh rpottsoh changed the title Clarify rna-transcription rna-transcription: Clarify problem description Oct 15, 2017
rpottsoh
rpottsoh previously approved these changes Oct 15, 2017
Copy link
Member

@rpottsoh rpottsoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure how necessary these changes are. However, they are aligned with the canonical data so I think adding these lines is fine. Thanks @jamesmcm for taking the time to help out. 👍


Note that since we are finding the complement, you should use the mapping of pairs above (and not just swap T for U).

Your function will need to be able to handle invalid inputs, both partial (a few invalid bases) and completely invalid strings. In these cases it should return an empty string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these cases it should return an empty string.

No! No! No!

Returning the empty string is a valid complement for the empty sequence as input.

Yes, it is part of this exercise to handle invalid input, because we use a string as input which has a much larger domain than just Nucleotides.

We have choosen to do this even on languages that could completely avoid invalid sequences because of their typesystem, because we treat it as a parsing and string transformation exercise.

But please, lets use the idiomatic error mechanisms of the implementing language. Lets throw something in Java, raise in Ruby, return an error-tuple/atom in elixir, erlang and LFE, lets have a Result- or Either-, or Option-, or Maybe-type in Rust, OCaml, Haskell, Idris, return an error-value in Go, do whatever is idiomatic in Python, but please do not force to use something unidiomatic via the description.

Of course tracks can choose to simply expect the empty string to signal errors, because they do not want to introduce proper error handling in this exercise but it should be the tracks maintainers who choose, not the description. Error handling is a subject that has to be treatened carefully…

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, the empty string is what passes the test in Python though, which is how I found this issue (I had no idea we even had to handle invalid cases).

Perhaps this should be changed there, and the descriptions changed in both to reflect that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please PR against their HINTS.md in that exercise with an accompanied regenerated README.md to describe the error handling part, but leave it here only as “There may be errors in the input which you need to detect and act accordingly” or something like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NobbZ, you're right regarding the error handling - I've submitted a PR to fix the Python implementation.

* `T` -> `A`
* `A` -> `U`

Note that since we are finding the complement, you should use the mapping of pairs above (and not just swap T for U).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you actually want to say with this sentence…

It is semantically the same as the sentence in front of the mapping, but pointing to the mapping above, instead of below:

Given a DNA strand, its transcribed RNA strand is formed by replacing each nucleotide with its complement:

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but in the sourced Rosalind exercise the problem is different. It was just to clarify that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a completely different thing and a big bummer!

This means, that either our exercise or the rosalind exercise in incorrect in terms of biology.

We need to find this out, would you mind open a clean ticket in this repository explaining the circumstances?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NobbZ, in the Rosalind exercise, they are converting the coding strand to RNA, but I find that a very strange approach. Typically people consider transcription with reference to the template strand (which is the complement of the coding strand), and this is definitely what most people would think of if asked to write an RNA transcription program without being given further detail. Regardless, I think the current exercise is probably clear enough about how we are doing it, since it gives a transcription table, and neither approach is technically wrong.


Note that since we are finding the complement, you should use the mapping of pairs above (and not just swap T for U).

Your function will need to be able to handle invalid inputs, both partial (a few invalid bases) and completely invalid strings. In these cases it should return an empty string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please PR against their HINTS.md in that exercise with an accompanied regenerated README.md to describe the error handling part, but leave it here only as “There may be errors in the input which you need to detect and act accordingly” or something like that.

@NobbZ
Copy link
Member

NobbZ commented Oct 15, 2017

Sorry for double creating that post… I seem to have clicked the wrong button…

Copy link
Contributor

@Insti Insti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this PR @jamesmcm

I realise you were following some guidance which suggested making this PR, which is often good advice for changes to track READMEs, but in this case I agree with @NobbZ and don't think it is appropriate to make these changes here.

I can't see any way to change this PR to be more mergeable so I suggest closing it.

@rpottsoh
Copy link
Member

Thanks @NobbZ and @Insti for weighing in here. I generally like the proposed changes, but I also agree with your assessments of how the changes would impact the tracks that implement this exercise.

I don't see anyway to change this PR to make it suitable for merging.

@rpottsoh rpottsoh dismissed their stale review October 15, 2017 22:35

I now fully realize the negative implications these changes would have on tracks that implement this exercise. I wasn't thinking along those lines when I approved the PR.

@Insti
Copy link
Contributor

Insti commented Oct 28, 2017

This appears to have now been handled by the Python track.

@Insti Insti closed this Oct 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants