# Julia programming exercises:
Exercises were retrieved from: <br>
https://exercism.org/tracks/julia/exercises/rna-transcription <br>
https://exercism.org/tracks/julia/exercises/hamming <br>
https://exercism.org/tracks/julia/exercises/nucleotide-count <br>
***

### RNA transcription:
You work for a bioengineering company that specializes in developing therapeutic solutions. <br>
Your team has just been given a new project to develop a targeted therapy for a rare type of cancer. <br>
Sometimes people's bodies produce too much of a given protein. That can cause all sorts of havoc. <br>
But if you can create a very specific molecule (called a micro-RNA), it can prevent the protein from being produced. <br>
This technique is called RNA Interference. <br>

### Instructions:
Your task is to determine the RNA complement of a given DNA sequence. <br>
Both DNA and RNA strands are sequences of nucleotides. <br>
The four nucleotides found in DNA are adenine (A), cytosine (C), guanine (G) and thymine (T). <br>
The four nucleotides found in RNA are adenine (A), cytosine (C), guanine (G) and uracil (U). <br>
Given a DNA strand, its transcribed RNA strand is formed by replacing each nucleotide with its complement: <br>
    G -> C <br>
    C -> G <br>
    T -> A <br>
    A -> U <br>

### Solution:

In [1]:
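function complement(dna::AbstractString)
    # Map each DNA nucleotide to its RNA complement.
    transc = Dict('G' => 'C', 'C' => 'G', 'T' => 'A', 'A' => 'U')
    # Look up every nucleotide and join the results into one string;
    # a character other than G, C, T, or A raises a KeyError.
    return join(transc[nucleotide] for nucleotide in dna)
end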

println(complement("AGCTATCGTAGGTCAGTAA"))

UCGAUAGCAUCCAGUCAUU
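
The solution above relies on the dictionary lookup failing with a `KeyError` when the input contains a character that is not a DNA nucleotide. As a minimal sketch of one alternative, the variant below checks each character explicitly and raises a more descriptive error. The names `DNA_TO_RNA` and `to_rna` are hypothetical and not part of the exercise.

```julia
# Hypothetical lookup table and helper, introduced only for this sketch.
const DNA_TO_RNA = Dict('G' => 'C', 'C' => 'G', 'T' => 'A', 'A' => 'U')

function to_rna(dna::AbstractString)
    # `map` over a string applies the function to each character
    # and collects the resulting characters into a new string.
    return map(dna) do nucleotide
        haskey(DNA_TO_RNA, nucleotide) ||
            throw(ArgumentError("invalid DNA nucleotide: $nucleotide"))
        DNA_TO_RNA[nucleotide]
    end
end

println(to_rna("AGCTATCGTAGGTCAGTAA"))  # UCGAUAGCAUCCAGUCAUU
```

Whether an invalid character should raise `ArgumentError`, `DomainError`, or something else is a design choice; `ArgumentError` is an assumption made here.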


### Calculate the Hamming distance between two DNA strands
Your body is made up of cells that contain DNA. Those cells regularly wear out and need replacing, which they achieve by dividing into daughter cells. In fact, the average human body experiences about 10 quadrillion cell divisions in a lifetime! <br>

When cells divide, their DNA replicates too. Sometimes during this process mistakes happen and single pieces of DNA get encoded with the incorrect information. If we compare two strands of DNA and count the differences between them we can see how many mistakes occurred. This is known as the "Hamming Distance". <br>

We read DNA using the letters C, A, G and T. Two strands might look like this: <br>

    GAGCCTACTAACGGGAT
    CATCGTAATGACGGCCT
    ^ ^ ^  ^ ^    ^^

They have 7 differences, and therefore the Hamming Distance is 7.

### Implementation notes
The Hamming distance is only defined for sequences of equal length, so an attempt to calculate it between sequences of different lengths should raise an error.

In [8]:
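function hamming(dna1::AbstractString, dna2::AbstractString)
    # The distance is undefined for strands of different lengths.
    if length(dna1) != length(dna2)
        throw(ArgumentError("DNA sequences must be of equal length"))
    end
    # Count the positions at which the two strands differ.
    return count(a != b for (a, b) in zip(dna1, dna2))
end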

println(hamming("AGGGCTACTA","TGAGATACGT"))
#println(hamming("AGCTCGAGTCG","AGTCGATC")) #Uncomment to test sequences of different lengths

5
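
To connect the code back to the worked example in the description, the sketch below reports where two strands differ, not only how many differences there are. The helper name `mismatch_positions` is hypothetical; it assumes 1-based positions and the same equal-length requirement as the solution.

```julia
# Hypothetical helper: 1-based positions at which two equal-length strands differ.
function mismatch_positions(dna1::AbstractString, dna2::AbstractString)
    length(dna1) == length(dna2) ||
        throw(ArgumentError("DNA sequences must be of equal length"))
    # Walk both strands in lockstep and keep the index of every mismatch.
    return [i for (i, (a, b)) in enumerate(zip(dna1, dna2)) if a != b]
end

positions = mismatch_positions("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")
println(positions)          # [1, 3, 5, 8, 10, 15, 16]
println(length(positions))  # 7
```

The number of reported positions is the Hamming distance, which matches the 7 differences marked in the description.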


### Nucleotide count
Each of us inherits from our biological parents a set of chemical instructions known as DNA that influence how our bodies are constructed. All known life depends on DNA! <br>

DNA is a long chain of other chemicals and the most important are the four nucleotides, adenine, cytosine, guanine and thymine. A single DNA chain can contain billions of these four nucleotides and the order in which they occur is important! We call the order of these nucleotides in a bit of DNA a "DNA sequence". <br>

We represent a DNA sequence as an ordered collection of these four nucleotides and a common way to do that is with a string of characters such as "ATTACG" for a DNA sequence of 6 nucleotides. 'A' for adenine, 'C' for cytosine, 'G' for guanine, and 'T' for thymine. <br>

Given a string representing a DNA sequence, count how many of each nucleotide is present. If the string contains characters that aren't A, C, G, or T then it is invalid and you should signal an error. <br>

In [6]:
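function countnucleotide(dna::AbstractString)
    # Start every valid nucleotide at zero so absent ones still appear in the result.
    counts = Dict('A' => 0, 'C' => 0, 'G' => 0, 'T' => 0)
    for nucleotide in dna
        # Reject anything that is not A, C, G, or T.
        if !haskey(counts, nucleotide)
            throw(DomainError(nucleotide, "$(nucleotide) is not a nucleotide"))
        end
        counts[nucleotide] += 1
    end
    return counts
end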

println(countnucleotide("AGCTAAGAGTCA"))
#println(countnucleotide("ACGUCGAAG")) #Uncomment to test sequence with invalid nucleotide

Dict('A' => 5, 'G' => 3, 'T' => 2, 'C' => 2)
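
Instead of uncommenting lines to try the invalid input by hand, the error case can be checked automatically. The following is a minimal sketch using the `Test` standard library. The function `nucleotide_counts` is a hypothetical re-implementation included only so the block runs on its own; it is not the solution above.

```julia
using Test  # part of Julia's standard library

# Hypothetical re-implementation, kept here only so the block is self-contained.
function nucleotide_counts(dna::AbstractString)
    counts = Dict(n => 0 for n in "ACGT")
    for n in dna
        haskey(counts, n) || throw(DomainError(n, "not a DNA nucleotide"))
        counts[n] += 1
    end
    return counts
end

@testset "nucleotide counts" begin
    # An empty strand still reports all four nucleotides, each with a count of zero.
    @test nucleotide_counts("") == Dict('A' => 0, 'C' => 0, 'G' => 0, 'T' => 0)
    @test nucleotide_counts("AGCTAAGAGTCA") == Dict('A' => 5, 'C' => 2, 'G' => 3, 'T' => 2)
    # 'U' belongs to RNA, so a DNA strand containing it should be rejected.
    @test_throws DomainError nucleotide_counts("ACGUCGAAG")
end
```

Comparing against a `Dict` avoids depending on the order in which a dictionary prints its entries, which Julia does not guarantee.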
