## Brief Overview

In [6]:
"""
Genome Sequence Similarity Analysis

Purpose
This script slides a window along a genome sequence and compares every substring whose length equals that of a reference gene against the reference gene. The goal is to determine, for each substring, the percentage of positions that match the reference gene.

Detailed Description
1. Genome and Reference Gene Definition: The script initializes a genome sequence as a string and a reference gene.
2. Length Calculation: It determines the length of the reference gene.
3. Substring Iteration: The script iterates through all substrings of the genome that match the length of the reference gene.
4. Similarity Calculation:
   - It compares each substring with the reference gene character by character.
   - Counts the number of matching characters.
   - Computes the similarity percentage.
5. Output: The script prints the similarity percentage for each substring and a signature message.

This script is useful as a simple, ungapped scan for regions of a genome that closely match a reference gene.
"""

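The character-by-character comparison described in the overview can be sketched as a small helper. The name `percent_identity` is hypothetical and does not appear in the notebook. The sketch assumes both strings have the same length and that the reference is not empty.

```python
def percent_identity(window: str, reference: str) -> float:
    """Return the percentage of positions at which two equal-length strings match."""
    if not reference:
        raise ValueError("reference must not be empty")
    if len(window) != len(reference):
        raise ValueError("window and reference must have the same length")
    # Pair up characters position by position and count the equal pairs.
    matches = sum(a == b for a, b in zip(window, reference))
    return matches / len(reference) * 100.0


print(percent_identity("ACGG", "ATGA"))  # 50.0 (positions 1 and 3 match)
```

Raising an error on unequal lengths is a design choice for this sketch. `zip` alone would silently ignore the extra characters of the longer string.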

## Genome Sequence

In [7]:
# Define the genome sequence
genome = "ATGACGGGGAAAAATTTCCCCCCTGCTCA"

## Reference Gene

In [8]:
# Define the reference gene sequence
ref_gene = "ATGA"

## Length of Reference Gene

In [9]:
# Calculate the length of the reference gene
ref_gene_len = len(ref_gene)
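The reference length also fixes how many substrings the loop will visit. A genome of length n and a reference of length k give n - k + 1 windows when k <= n, and none otherwise. A minimal sketch follows. `window_count` is a hypothetical helper, and the short genome in the usage line is illustrative rather than the notebook's sequence.

```python
def window_count(genome: str, ref_gene: str) -> int:
    """Return how many substrings of len(ref_gene) fit inside genome."""
    k = len(ref_gene)
    if k == 0:
        # An empty reference would later cause a division by zero.
        raise ValueError("ref_gene must not be empty")
    # A reference longer than the genome yields no windows at all.
    return max(len(genome) - k + 1, 0)


print(window_count("ATGACG", "ATGA"))  # 3 windows: ATGA, TGAC, GACG
```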

## Executing Loop

In [10]:
# Loop through the genome to extract substrings of the same length as the reference gene
for i in range(len(genome) - ref_gene_len + 1):
    # Extract the substring from the genome
    temp_string = genome[i:i + ref_gene_len]
    
    # Count the positions where the substring and the reference gene match
    matches = sum(1 for a, b in zip(temp_string, ref_gene) if a == b)
    
    # Calculate the similarity percentage
    similarity = (matches / ref_gene_len) * 100.0
    
    # Print the similarity result
    print(f"The combination: {temp_string} is similar to {ref_gene} with {similarity:.2f}%")

The combination: ATGA is similar to ATGA with 100.00%
The combination: TGAC is similar to ATGA with 0.00%
The combination: GACG is similar to ATGA with 0.00%
The combination: ACGG is similar to ATGA with 50.00%
The combination: CGGG is similar to ATGA with 25.00%
The combination: GGGG is similar to ATGA with 25.00%
The combination: GGGA is similar to ATGA with 50.00%
The combination: GGAA is similar to ATGA with 25.00%
The combination: GAAA is similar to ATGA with 25.00%
The combination: AAAA is similar to ATGA with 50.00%
The combination: AAAA is similar to ATGA with 50.00%
The combination: AAAT is similar to ATGA with 25.00%
The combination: AATT is similar to ATGA with 25.00%
The combination: ATTT is similar to ATGA with 50.00%
The combination: TTTC is similar to ATGA with 25.00%
The combination: TTCC is similar to ATGA with 25.00%
The combination: TCCC is similar to ATGA with 0.00%
The combination: CCCC is similar to ATGA with 0.00%
The combination: CCCC is similar to ATGA with 0.00%
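Instead of only printing each percentage, the results can be collected so that the best-scoring substrings are easy to pick out. The sketch below is one way to do that. `scan` and `best_windows` are hypothetical names that are not in the notebook, and positions are reported as 0-based indices.

```python
def scan(genome: str, ref_gene: str) -> list[tuple[int, str, float]]:
    """Return (start index, substring, similarity %) for every window."""
    k = len(ref_gene)
    if k == 0:
        raise ValueError("ref_gene must not be empty")
    results = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        matches = sum(a == b for a, b in zip(window, ref_gene))
        results.append((i, window, matches / k * 100.0))
    return results


def best_windows(results: list[tuple[int, str, float]]) -> list[tuple[int, str, float]]:
    """Return every window that shares the highest similarity."""
    if not results:
        return []
    top = max(score for _, _, score in results)
    return [r for r in results if r[2] == top]


genome = "ATGACGGGGAAAAATTTCCCCCCTGCTCA"
ref_gene = "ATGA"
for start, window, score in best_windows(scan(genome, ref_gene)):
    print(f"Best match at index {start}: {window} ({score:.2f}%)")
```

Keeping the start index alongside each substring makes it possible to locate a hit in the genome, which the printed output alone does not show.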

## Author

In [12]:
# Print a signature message
print("\n---------------------------")
print("By: Sayed Muhammad Mehdi Shah")
print("Roll No: F21BIBBB1M06006")


---------------------------
By: Sayed Muhammad Mehdi Shah
Roll No: F21BIBBB1M06006
