rna-transcription -- weird non-biological thing slipped into this problem #148

Closed
kytrinyx opened this issue Dec 2, 2015 · 33 comments

Comments

@kytrinyx
Member

kytrinyx commented Dec 2, 2015

I can't remember what happened but we've ended up with some tracks that do the "transcription" both ways, DNA to RNA and also RNA to DNA. Since, biologically speaking, it only goes from DNA to RNA, we should probably figure out which language tracks have the backwards thing implemented and make a note to fix it.

@behrtam
Contributor

behrtam commented Dec 2, 2015

@kytrinyx
Member Author

kytrinyx commented Dec 2, 2015

thanks

@mhelmetag

So this should just be a matter of simplifying the exercises by taking out backwards transcription? I could start with ruby. I could also open issues in each language track where this occurs.

@kytrinyx
Member Author

Yepp, that should be enough. Opening an issue in each track would be great, thanks!

@mhelmetag

I can grab this issue for the ruby and js implementations and update the test and example files accordingly. I could take a crack at the swift, fsharp and csharp ones but... no promises that they'll be idiomatic 😆

Also I'll probably check around and see if they exist in the other language tracks.

Oh well... this will simplify this exercise but there are a bunch of other cool bio related exercises also from Rosalind that we can pull in. Also, this may lower the bar of entry a bit for people getting into exercism... so that's always a plus.

@kytrinyx
Member Author

Here's the list of languages that have implemented the exercise: http://x.exercism.io/problems/rna-transcription

@rchavarria

I checked xecmascript track and it only implements DNA to RNA (rna-transcription exercise).

@mhelmetag

Sweet! 👍

This was referenced Apr 7, 2016
@kytrinyx
Member Author

kytrinyx commented Apr 7, 2016

Yeah. Let's leave it to the conversation on the site to talk about hard-coded values vs other approaches. I'll fix the JSON.

kytrinyx added a commit that referenced this issue Apr 7, 2016
There is no biological basis for RNA->DNA transcription.

A couple of people suggested that we could rephrase this and make the
question _Given this RNA, what DNA sequence could produce it?_, but
it seemed simpler to simply remove it.

There is also the question of whether or not having the reverse
transcription helps guide people to a better/more generic solution.
I think that this is one place where we could have the discussion on the
website about hard-coding vs not.

See #148
@chezwicker

I always thought this made sense from the programming side even if it's nonsense in terms of biology: it makes sure the solution doesn't hard code values but abstracts by e.g. defining source -> target "maps" that can be passed to the algo used.

I would value this more than consistency with reality. Maybe we could rephrase and state that sometimes we find isolated RNA and want to know what DNA strand it resulted from...
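
The source -> target "maps" idea can be sketched in a few lines. This is only an illustration in Python: the names `DNA_TO_RNA`, `RNA_TO_DNA`, and `transcribe` are hypothetical and not taken from any track's example solution. The point is that the reverse direction costs one extra table rather than a second algorithm.

```python
# Hypothetical names for illustration; not from any Exercism track.
DNA_TO_RNA = {"G": "C", "C": "G", "T": "A", "A": "U"}

# Inverting the table gives the RNA -> DNA direction without new logic.
RNA_TO_DNA = {rna: dna for dna, rna in DNA_TO_RNA.items()}


def transcribe(strand: str, table: dict[str, str]) -> str:
    """Map every nucleotide in `strand` through `table`."""
    try:
        return "".join(table[nucleotide] for nucleotide in strand)
    except KeyError as err:
        # Reject anything the table does not know about.
        raise ValueError(f"invalid nucleotide: {err.args[0]!r}") from None


print(transcribe("ACGTGGTCTTAA", DNA_TO_RNA))  # UGCACCAGAAUU
print(transcribe("UGCACCAGAAUU", RNA_TO_DNA))  # ACGTGGTCTTAA
```

A solution that hard-codes the four DNA -> RNA substitutions passes a one-way test suite just as well, which is the trade-off being discussed here.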

@kytrinyx
Member Author

kytrinyx commented Apr 7, 2016

@chezwicker You're right that it helps avoid hard-coding. I'm not sure if it's enough to make this part of the discussion. Or perhaps we can add it to the README as an extension or "something to think about".

@chezwicker

Sure, that works as well - I guess it goes back to the old discussion about whether comments / documentation are a good thing. My view is that they are as long as what they are saying can't be expressed in code, but aren't otherwise. I would have mapped that to saying that if a requirement can be defined in a test, that's better than in a README ;-)

@kytrinyx
Member Author

kytrinyx commented Apr 7, 2016

It's true. I wish I had gotten y'alls point of view before I submitted all those issues!

kytrinyx added a commit that referenced this issue Apr 7, 2016
There is no biological basis for RNA->DNA transcription.

A couple of people suggested that we could rephrase this and make the
question _Given this RNA, what DNA sequence could produce it?_, but
it seemed simpler to simply remove it.

There is also the question of whether or not having the reverse
transcription helps guide people to a better/more generic solution.
I think that this is one place where we could have the discussion on the
website about hard-coding vs not.

See #148
@rebelwarrior

BTW, I may be remembering this wrong (I was a Bio major 15 years ago, and things change), but I think transcription does go both ways. DNA produces RNA in humans, but I think some other organisms, viruses and possibly bacteria, insert RNA into DNA by reverse transcribing it. I believe that is how gene therapy works: using a virus vector to inject DNA, and I think it goes via RNA. But I'll double-check.

In any case, from a programming standpoint the exercise works as an encode/decode, and I agree with @chezwicker that it is valid even if it's not biologically right.

@kytrinyx
Member Author

Yeah, there are viruses that do this (via reverse transcriptase, for instance), but it's a bit of an edge case. I would rather stick with the basics and, if we introduce the inverse, use the "what DNA sequence would produce this?" argument. (I was a genetics major, but it's also been over a decade since I spent any time thinking about this.)

@rebelwarrior

Ha ha, I just got confirmation that retroviruses use it. I forgot you were a genetics major.

@rpottsoh
Member

rpottsoh commented Jan 20, 2018

It appears to me that the only thing keeping this issue open is exercism/plsql#11; that issue could probably just be closed. Also, it seems to me that the choice to test RNA -> DNA conversion should maybe be a track-level decision.

@kytrinyx
Member Author

I'll close this one. The PL/SQL issue doesn't seem resolved, even if it sounds like they decided to override the description.

@martinfreedman

martinfreedman commented Jan 21, 2018

WTF! https://en.wikipedia.org/wiki/Reverse_transcriptase! This not only has a biological basis, but removing it ruins, AFAICS, the whole point of this exercise which becomes completely trivial. Even if there were not a biological argument, that could be noted and still be part of the exercise.

@kytrinyx
Member Author

@martinfreedman Fair point.

This exercise has been a bit of a mess, because when I implemented it, I was thinking only of the basic RNA transcription process. I think there are two interesting things that have come out of this discussion:

  1. The more interesting exercise goes both ways
  2. As described in the current (or any of the previous) descriptions, there's a lot of room for confusion.

My recommendation here would be to deprecate the existing exercise and create a new one that goes both ways, framing the problem without so much room for confusion. This would mean a better name and a better description based on the new framing.

@kytrinyx kytrinyx reopened this Jan 25, 2018
@martinfreedman

martinfreedman commented Jan 25, 2018

@kytrinyx I am up for doing that.

Already rewrote the c# version (tests etc.) for myself.

I would call it "nucleotide-transcription"

Need to clone the problem-specifications git and make sure I can build out correctly into the csharp exercise and compare to my "original".

Then I could PR either separately or all together metadata.yml, description.md and canonical-data.json

Is this the right way to go about doing this?

UPDATE: Read up on contributing.md so ignore this last question

@rpottsoh
Member

@kytrinyx I just want to make sure you are aware of the comments that were made in #1078 starting here that I think tie into this issue.

@martinfreedman

One question on the canonical-data-schema

How do I re-use variables within different tests? E.g. in C#/xUnit I have:

const string _dna = "GATGGAACTTGACTACGTAAATT";
// RNA complement of _dna (G->C, C->G, T->A, A->U)
const string _rna = "CUACCUUGAACUGAUGCAUUUAA";
...
[Fact]
public void Rna_complement()
{
    Assert.Equal(_rna, Transcription.ToRna(_dna));
}

[Fact]
public void Rna_to_dna_to_rna_double_transcription_produces_original_rna_strand()
{
    Assert.Equal(_rna, Transcription.ToRna(Transcription.ToDna(_rna)));
}
...

@coriolinus
Member

coriolinus commented Jan 25, 2018 via email

@rpottsoh
Member

rpottsoh commented Jan 25, 2018

JSON doesn't "run". Take a look at the file canonical-data.json for rna-transcription. It is up to the track maintainers to write the tests for their language based on what the JSON file describes. Some tracks have developed programs to generate test suites from the JSON while other tracks generate their test suites by hand from the JSON. The canonical-data simply describes what each test is testing, what property it is testing, what the input might be, and what the expected output might be. I use the word might because it is possible sometimes for test cases to not have input or an expected output. However, place holders for input and expected are always included in the JSON.

I see that @coriolinus summed it up nicely.
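
To make the "JSON describes, the track implements" split concrete, here is a rough Python sketch of a track-side runner. The JSON fragment is illustrative only: its field layout (`description`, `property`, `input`, `expected`) is an assumption based on the description above, not a copy of the real canonical-data.json, and `to_rna`, `IMPLEMENTATIONS`, and `run_cases` are hypothetical names.

```python
import json

# Illustrative fragment; the field layout is assumed, not copied from the real file.
CANONICAL = json.loads("""
{
  "exercise": "rna-transcription",
  "cases": [
    {"description": "RNA complement of cytosine is guanine",
     "property": "toRna", "input": "C", "expected": "G"},
    {"description": "RNA complement of thymine is adenine",
     "property": "toRna", "input": "T", "expected": "A"},
    {"description": "RNA complement of a longer strand",
     "property": "toRna", "input": "ACGTGGTCTTAA", "expected": "UGCACCAGAAUU"}
  ]
}
""")


def to_rna(dna: str) -> str:
    """Track-specific implementation under test (hypothetical)."""
    return dna.translate(str.maketrans("GCTA", "CGAU"))


# Each "property" named in the JSON is bound to a function by the track.
IMPLEMENTATIONS = {"toRna": to_rna}


def run_cases(data: dict) -> list[tuple[str, bool]]:
    """Return (description, passed) for every case in the data."""
    results = []
    for case in data["cases"]:
        func = IMPLEMENTATIONS[case["property"]]
        results.append((case["description"], func(case["input"]) == case["expected"]))
    return results


for description, passed in run_cases(CANONICAL):
    print("ok  " if passed else "FAIL", description)
```

Shared fixtures such as a common DNA string therefore live in the generated test code for a track, not in the JSON itself.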

@martinfreedman

Understood.

So shall I proceed, @kytrinyx? Create a new issue first, or continue here?

@rpottsoh
Member

rpottsoh commented Jan 25, 2018 via email

@kytrinyx
Member Author

> I am up for doing that.

Excellent! Let's discuss the details and hash out a name there (at first blush "nucleotide-transcription" sounds good, but I'd love to have some space to actually discuss it and get input from others).

> just to make sure you are aware of the comments that were made in #1078

@rpottsoh Ah, yepp. I'll deal with that there. I don't mind that we've defined it differently at all.

> Create a new issue first, or continue here?

@martinfreedman Go ahead and start clean in a new issue. I'm going to re-close this one.

Thanks everyone for your patience and input in getting this all optimized for both clarity and learning.
