# String Manipulation
Building on our work with strings, lists, and dictionaries, let's tackle another [Rosalind Problem](https://rosalind.info/problems/rna/) that focuses on string manipulation.

This problem introduces a fundamental concept in molecular biology: **transcription**, the process of creating RNA from a DNA template.

## String Methods Review

Before we dive into the problem, let's review some useful string methods:


In [1]:
my_string = "HELLO WORLD"
print(my_string.replace("L", "X"))  # Replace all 'L' with 'X'
print(my_string.lower())            # Convert to lowercase
print(my_string.upper())            # Convert to uppercase

HEXXO WORXD
hello world
HELLO WORLD
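
One detail worth keeping in mind: Python strings are immutable, so these methods return a new string and leave the original untouched. The sketch below illustrates this, along with the optional third argument of `replace()` that limits how many occurrences are replaced. The variable names are only illustrative.

```python
greeting = "HELLO WORLD"

# replace() returns a new string; the original is not modified
replaced = greeting.replace("L", "X")

# An optional third argument limits how many occurrences are replaced
replaced_once = greeting.replace("L", "X", 1)

print(greeting)       # HELLO WORLD
print(replaced)       # HEXXO WORXD
print(replaced_once)  # HEXLO WORLD
```

Because the original is unchanged, remember to assign the result to a variable if you want to keep it.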


We can also build new strings character by character:

In [2]:
original = "ABC"
new_string = ""
for char in original:
    if char == "B":
        new_string += "X"  # Replace B with X
    else:
        new_string += char
print(new_string)  # "AXC"

AXC


### The Biology Behind It
In cells, DNA serves as the template for creating RNA through transcription:

- DNA uses bases: A, T, G, C
- RNA uses bases: A, U, G, C
- The key difference: T in DNA becomes U in RNA
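
As a small illustration, the two alphabets can be written as Python sets and compared directly. This is only a sketch; the variable names are assumptions made for this example and are not part of the repository.

```python
# The two nucleotide alphabets, written as Python sets
dna_bases = set("ATGC")
rna_bases = set("AUGC")

shared = dna_bases & rna_bases    # bases found in both alphabets
dna_only = dna_bases - rna_bases  # bases found only in DNA
rna_only = rna_bases - dna_bases  # bases found only in RNA

print(sorted(shared))    # ['A', 'C', 'G']
print(sorted(dna_only))  # ['T']
print(sorted(rna_only))  # ['U']
```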

# Problem

**Given**: A DNA string `t` of length at most 1000 nt.

**Return**: The transcribed RNA string of `t`.

Let's look at the sample:

In [3]:
t = "GATGGAACTTGACTACGTAAATT"
expected = "GAUGGAACUUGACUACGUAAAUU"

Notice how every 'T' in the DNA string becomes 'U' in the RNA string, while A, G, and C remain unchanged.
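
One way to confirm this observation, without solving the problem yet, is to compare the two sample strings position by position. The sketch below uses `zip()` to pair up characters and collects every pair that differs; the names `dna`, `rna`, and `differences` are illustrative.

```python
dna = "GATGGAACTTGACTACGTAAATT"
rna = "GAUGGAACUUGACUACGUAAAUU"

# Pair up characters at the same position and keep the pairs that differ
differences = {(d, r) for d, r in zip(dna, rna) if d != r}

print(len(dna) == len(rna))  # True
print(differences)           # {('T', 'U')}
```

The only kind of mismatch is a 'T' in the DNA paired with a 'U' in the RNA, and the two strings have the same length.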

This repository comes with a validator to check solutions. No peeking!

In [10]:
import os

if os.getcwd().endswith("notebooks"):
    os.chdir("..")

from src.rna import validator

validator(t, expected)

True

What can you do to process this input t and return the transcribed RNA?

In [11]:
# your code here
rna_result = "something"

validator(t, rna_result)  # Validate your answer

RNA string contains invalid character: s


False

<details><summary>Hint 1</summary>

Think about this step by step:

- Look at each character in the DNA string
- If it's a 'T', change it to 'U'
- If it's anything else (A, G, C), keep it the same
- Build up the RNA string character by character

</details>

Now, let's try on a larger dataset:

In [12]:
with open("data/rosalind_rna.txt", "r") as file:
    t = file.read().strip()  # Get a larger DNA sequence from a file

# Your code here

<details><summary>Hint 2</summary>
You could solve this with a loop:

```python
rna_result = ""
for nucleotide in t:
    if nucleotide == "T":
        rna_result += "U"
    else:
        rna_result += nucleotide
```

But there might be an even simpler way...
</details>

<details><summary>Hint 3</summary>
Python strings have a built-in method that can replace all occurrences of one character with another. Think about what method that might be!
</details>

A few possible solutions (there are many!):

<details><summary>Solution</summary>

```python
# Method 1: Using string replace (simplest, and usually the fastest)
rna_result = t.replace("T", "U")

# Method 2: Using a loop (more explicit)
rna_result = ""
for nucleotide in t:
    if nucleotide == "T":
        rna_result += "U"
    else:
        rna_result += nucleotide

# Method 3: Using list comprehension and join
rna_result = "".join(["U" if nucleotide == "T" else nucleotide for nucleotide in t])
```
</details>

# Advanced

There are several other string methods and techniques that could help solve this problem efficiently.

<details><summary>Other methods</summary>

- `str.translate()` with `str.maketrans()`: create a translation table for character mapping
- `str.join()`: efficiently combine a list of characters into a string
- List comprehensions: build a list of characters concisely, then join it into a string
- `map()`: apply a function to each character

Example with translate:

```python
translation_table = str.maketrans("T", "U")
rna_result = t.translate(translation_table)
```
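
The `map()` approach from the list above could look something like the sketch below. The helper `transcribe_base` is a hypothetical name introduced for this example, and the DNA string is the sample from the problem statement.

```python
def transcribe_base(base):
    """Return the RNA counterpart of a single DNA base (hypothetical helper)."""
    return "U" if base == "T" else base

dna = "GATGGAACTTGACTACGTAAATT"  # sample DNA string from the problem above

# map() applies the helper to each character; join() glues the results together
rna_result = "".join(map(transcribe_base, dna))
print(rna_result)  # GAUGGAACUUGACUACGUAAAUU
```

For a single-character swap this is more machinery than `replace()` needs, but the same pattern extends naturally to rules that depend on more than one base.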

</details>

Try implementing the solution using a different method, or create your own validator function that checks if the transcription follows the biological rules correctly.
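
<details><summary>Validator sketch</summary>

If you get stuck on the validator idea, here is one possible starting point. It is a minimal sketch under simple assumptions (uppercase input, one RNA base per DNA base), and the name `check_transcription` is hypothetical. It is not the `validator` shipped in `src.rna`, whose implementation may differ.

```python
def check_transcription(dna, rna):
    """Hypothetical checker: is `rna` a valid transcription of `dna`?"""
    # Transcription here maps one base to one base, so lengths must match
    if len(dna) != len(rna):
        return False
    for d, r in zip(dna, rna):
        # Reject characters outside the DNA or RNA alphabets
        if d not in "ATGC" or r not in "AUGC":
            return False
        # 'T' must become 'U'; every other base must stay the same
        expected_base = "U" if d == "T" else d
        if r != expected_base:
            return False
    return True

print(check_transcription("GATGGAACTTGACTACGTAAATT", "GAUGGAACUUGACUACGUAAAUU"))  # True
print(check_transcription("GATT", "GATT"))  # False
```

</details>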

In [9]:
# Your alternative solution here

# Bonus: Create a reverse function that converts RNA back to DNA
def rna_to_dna(rna_string):
    # Your code here
    pass