StreamModeError #3057

Closed
ghost opened this issue Apr 1, 2021 · 6 comments
Labels
type:bug Something isn't working

Comments

@ghost

ghost commented Apr 1, 2021

I'm trying to use Streamlit with Biopython and it's giving me the error below. What might be the problem?
Error:

StreamModeError: Fasta files must be opened in text mode.
Traceback:
File "c:\users\akim nyoni\anaconda3\lib\site-packages\streamlit\script_runner.py", line 333, in run_script
    exec(code, module.__dict__)
File "C:\Users\AKIM NYONI\Desktop\Xul\SCI4101 Bioinformatcs\app\BioInfo-App\App.py", line 38, in <module>
    main()
File "C:\Users\AKIM NYONI\Desktop\Xul\SCI4101 Bioinformatcs\app\BioInfo-App\App.py", line 28, in main
    dna_record = SeqIO.read( seq_file ,"fasta")
File "c:\users\akim nyoni\anaconda3\lib\site-packages\Bio\SeqIO\__init__.py", line 654, in read
    iterator = parse(handle, format, alphabet)
File "c:\users\akim nyoni\anaconda3\lib\site-packages\Bio\SeqIO\__init__.py", line 607, in parse
    return iterator_generator(handle)
File "c:\users\akim nyoni\anaconda3\lib\site-packages\Bio\SeqIO\FastaIO.py", line 183, in __init__
    super().__init__(source, mode="t", fmt="Fasta")
File "c:\users\akim nyoni\anaconda3\lib\site-packages\Bio\SeqIO\Interfaces.py", line 52, in __init__
    raise StreamModeError(

@ghost ghost added type:bug Something isn't working status:needs-triage Has not been triaged by the Streamlit team labels Apr 1, 2021
@kmcgrady
Collaborator

kmcgrady commented Apr 8, 2021

Hi @akim-nyoni, can you provide the code showing what you do between the file uploader and the line that causes the error? My guess is that the file uploads correctly, but the input to Biopython is incorrect.

Just to be clear, the output from the FileUploader is an instance of BytesIO. You might need to convert it to a string. This StackOverflow answer might help.

# assume bytes_io is a `BytesIO` object
byte_str = bytes_io.read()

# Convert to a "unicode" object
text_obj = byte_str.decode('UTF-8')  # Or use the encoding you expect

@kmcgrady kmcgrady added status:awaiting-user-response Issue requires clarification from submitter and removed status:needs-triage Has not been triaged by the Streamlit team labels Apr 8, 2021
@ghost
Author

ghost commented Apr 10, 2021

import streamlit as st
import matplotlib.pyplot as plt
import matplotlib
matplotlib.use("Agg")
from Bio.Seq import Seq
from Bio import SeqIO
from collections import Counter
import neatbio.sequtils as utils
import numpy as np
from PIL import Image

def main():

    st.title("Protein Synthesis App")

    activity = ['Welcome','DNA Sequence','About']
    choice = st.sidebar.selectbox("Select Activity",activity)
    if choice == "Welcome":
        st.subheader("Welcome To Our Protein Synthesis App\n This App Is For : Transcription and Translation Of DNA")
    elif choice == "DNA Sequence":
        st.subheader("DNA Sequence Analysis")
        seq_file = st.file_uploader("Upload FASTA File",type=["fasta","fa"])
        if seq_file is not None:
            dna_record = SeqIO.read(seq_file,"fasta")
            dna_seq = dna_record.seq

            details = st.radio("Details",("Description","Sequence"))
            if details == "Description":
                st.write(dna_record.description)
            elif details == "Sequence":
                st.write(dna_record.seq)


            # Protein Synthesis
            st.subheader("Protein Synthesis")
            if st.checkbox("Transcription"):
                st.write(dna_seq.transcribe())

            elif st.checkbox("Translation"):
                st.write(dna_seq.translate())


    elif choice == "About":
        st.subheader("About Us")

if __name__ == '__main__':
    main()

The app is for transcription and translation of DNA.

@kmcgrady
Collaborator

Thank you for the code, @akim-nyoni. The challenge here is converting the output of the file uploader, which is a BytesIO, to a text stream (likely StringIO). Here's example code adapted from the same StackOverflow answer. Hopefully this resolves the issue.

import io
# seq_file is the `BytesIO` object returned by the file uploader
byte_str = seq_file.read()

# Convert to a "unicode" string object
text_obj = byte_str.decode('UTF-8')  # Or use the encoding you expect

# convert to io.StringIO, which I think SeqIO will understand well.
dna_record = SeqIO.read(io.StringIO(text_obj), "fasta")

Since this is not an issue with Streamlit, I am going to close this issue. Feel free to let us know if you get any more errors, but I recommend using our forums. They are monitored by multiple people, many of whom may be familiar with Biopython and how to make it work with Streamlit. Here's a forum discussion on the same conversion from BytesIO to a string.
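
For reference, a minimal standard-library sketch of that conversion. The helper name `to_text_handle` and the sample FASTA bytes are illustrative assumptions, not part of Streamlit or Biopython, and `io.BytesIO` stands in here for the object returned by `st.file_uploader`.

```python
import io


def to_text_handle(binary_handle, encoding="utf-8"):
    """Return a text-mode handle (io.StringIO) for a binary file-like object.

    `to_text_handle` is a hypothetical helper name used only for illustration.
    """
    raw = binary_handle.read()  # bytes read from the uploaded file
    return io.StringIO(raw.decode(encoding))


# Assumption: io.BytesIO stands in for the object returned by st.file_uploader.
uploaded = io.BytesIO(b">seq1 example record\nATGGCC\n")
text_handle = to_text_handle(uploaded)

# Peek at the header line, then rewind so a parser sees the whole file.
first_line = text_handle.readline().rstrip("\n")
text_handle.seek(0)
print(first_line)
```

In the app, the equivalent call would be something like `SeqIO.read(to_text_handle(seq_file), "fasta")`, placed inside the `if seq_file is not None:` branch so that it only runs once a file has been uploaded.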

@ghost
Author

ghost commented Apr 10, 2021

Thank you, it worked

@arighosh1

import streamlit as st
import neatbio.sequtils as utils
from Bio.Seq import Seq
from Bio import SeqIO
from collections import Counter
import io

# data pkgs

import matplotlib.pyplot as plt
import matplotlib
matplotlib.use("Agg")
import numpy as np
st.set_option('deprecation.showfileUploaderEncoding', False)

def main():

    st.title('Simple Bioinformatics App')
    menu = ["DNA Sequence","Dot Plot"]
    choice = st.sidebar.selectbox("Select Activity",menu)

    # assume bytes_io is a BytesIO object
    byte_str = seq_file.read()

    # Convert to a "unicode" string object
    text_obj = byte_str.decode('UTF-8') # Or use the encoding you expect

    # convert to io.StringIO, which I think SeqIO will understand well.
    SeqIO.read(io.StringIO(text_obj))

    if choice == "DNA Sequence":
        st.subheader("DNA Sequence Analysis")
        seq_file = st.file_uploader("Upload FASTA File",type = ["fasta","fa"])
        #text_io = io.TextIOWrapper(seq_file)

        if seq_file is not None:
            dna_record = SeqIO.read(seq_file,"fasta")
            #st.write(dna_record)
            dna_seq = dna_record.seq
            desc = dna_record.description
            details = st.radio("Details", ("Description", "Sequence"))
            if details == "Description":
                 st.write(desc)
            elif details == "Sequence":
                 st.write(dna_seq)

            # Nucleotide Frequencies
            st.subheader("Nucleotide Frequencies")
            dna_freq = Counter(dna_seq)
            st.write(dna_freq)
            adenine_color = st.beta_color_picker("Adenine Color")
            guanine_color = st.beta_color_picker("Guanine Color")
            thymine_color = st.beta_color_picker("Thymine Color")
            cytosil_color = st.beta_color_picker("Cytosil Color")

            if st.button("Plot Freq"):
                barlist = plt.bar(dna_freq.keys(), dna_freq.values())
                barlist[0].set_color(adenine_color)
                barlist[1].set_color(thymine_color)
                barlist[2].set_color(guanine_color)
                barlist[3].set_color(cytosil_color)

                st.pyplot()

            st.subheader("DNA Composition")
            gc_score = utils.gc_content(str(dna_seq))
            at_score = utils.at_content(str(dna_seq))
            st.json({"GC Content ":gc_score, "AT Content ":at_score})

            # Nucleotide Count
            nt_count = st.text_input("Enter Nucleotide Here","Type Nucleotide Alphabet")
            st.write("Number of {} nucleotide is : {} ".format((nt_count),str(dna_seq).count(nt_count)))

            # Protein Synthesis
            p1 = dna_seq.translate()
            aa_freq = Counter(str(p1))

            st.subheader("Protein Synthesis")
            if st.checkbox("Transcription"):
                st.write(dna_seq.transcribe())
            elif st.checkbox("Translation"):
                st.write(dna_seq.translate())
            elif st.checkbox("Complement"):
                st.write(dna_seq.complement())
            elif st.checkbox("AA Frequency"):
                st.write(aa_freq)
            elif st.checkbox("AA Plot Frequency"):
                #aa_color = st.beta_color_picker("Amino Acid Color")
                #barlist = plt.bar(aa_freq.keys(), aa_freq.values(), color = aa_color)
                plt.bar(aa_freq.keys(), aa_freq.values())
                #barlist[0].set_color(aa_color)
                st.pyplot()
            elif st.checkbox("Full Amino Acid Name"):
                aa_name = str(p1).replace("*","")
                st.write(aa_name)
                st.write("--------------------------")
                st.write(utils.convert_1to3(aa_name))

    elif choice == "Dot Plot":
        st.subheader("Generate Dot Plot For Two Sequences")
        seq_file = st.file_uploader("Upload 1st FASTA File", type=["fasta", "fa"])
        seq_file2 = st.file_uploader("Upload 2nd FASTA File", type=["fasta", "fa"])

        # text_io = io.TextIOWrapper(seq_file)

        if seq_file and seq_file2 is not None:
            dna_record1 = SeqIO.read(seq_file, "fasta")
            dna_record2 = SeqIO.read(seq_file2, "fasta")

            # st.write(dna_record)
            dna_seq1 = dna_record1.seq
            dna_seq2 = dna_record2.seq

            desc1 = dna_record1.description
            desc2 = dna_record2.description

            details = st.radio("Details", ("Description", "Sequence"))
            if details == "Description":
                st.write(desc1)
                st.write("----------")
                st.write(desc2)
            elif details == "Sequence":
                st.write(dna_seq1)
                st.write("----------")
                st.write(dna_seq2)

            custom_limit = st.number_input("Select max number of Nucleotide ",10,200,25)
            if st.button("Dot Plot"):
                st.write("Comparing the first {} Nucleotide of Two Sequences ".format(custom_limit))
                dotplotx(dna_seq1[0:custom_limit], dna_seq2[0:custom_limit])
                st.pyplot()

def delta(x,y):
    return 0 if x == y else 1

def M(seq1,seq2,i,j,k):
    return sum(delta(x,y) for x,y in zip(seq1[i:i+k], seq2[j:j+k]))

def makeMatrix(seq1,seq2,k):
    n = len(seq1)
    m = len(seq2)
    return [[M(seq1,seq2,i,j,k) for j in range(m-k+1)] for i in range(n-k+1)]

def plotMatrix(M, t, seq1, seq2, nonblank = chr(0x25A0), blank = ' '):
    print(' |' + seq2)
    print('-'*(2 + len(seq2)))
    for label,row in zip(seq1, M):
        line = ''.join(nonblank if s < t else blank for s in row)
        print(label + '|' + line )

def dotplot(seq1, seq2, k=1, t=1):
    M = makeMatrix(seq1,seq2,k)
    plotMatrix(M,t,seq1,seq2)

# Convert to Fxn

def dotplotx(seq1,seq2):
    plt.imshow(np.array(makeMatrix(seq1,seq2,1)))
    # on x axis list all sequences of seq2
    xt = plt.xticks(np.arange(len(list(seq2))), list(seq2))
    # on y axis list all sequences of seq1
    yt = plt.yticks(np.arange(len(list(seq1))), list(seq1))
    plt.show()

if __name__ == '__main__':
    main()

I am new here. Where did I make a mistake? I think I placed some parts in the wrong place.

@kmcgrady
Collaborator

kmcgrady commented Jun 9, 2021

Hi there @arighosh1 , thanks for sharing your question! Unfortunately we can't always dig deep and troubleshoot challenges like this on GitHub. We typically use GitHub for feature requests and bug reports.

For troubleshooting help, you can visit our awesome community forums, where thousands of other Streamlit hackers help each other and talk about the amazing things they are creating!
