In this notebook, I show how to create custom FASTA and GTF files when you are working with a chimeric/humanized gene.

References used for this tutorial:

**Mouse**:

- GTF link: [Gencode_vM32](https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_mouse/release_M32/gencode.vM32.chr_patch_hapl_scaff.annotation.gtf.gz)
- Fasta link: [mm39](https://hgdownload.soe.ucsc.edu/goldenPath/mm39/bigZips/mm39.fa.gz)

**Human**:

- GTF link: [Gencode_v29](https://www.encodeproject.org/files/gencode.v29.primary_assembly.annotation_UCSC_names/@@download/gencode.v29.primary_assembly.annotation_UCSC_names.gtf.gz)
- Fasta link: [GRCh38](https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz)

## Load library

In [None]:
import AGEpy as age
import pandas as pd

## Create GTF file

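You need to extract the gene and transcript records for the given gene from the human GTF file, assign them to a new chromosome, and add them to the mouse GTF. The new chromosome name must not clash with an existing mouse chromosome: the mouse genome has chr1 to chr19 plus chrX and chrY, so chr20 and chr21 are free.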

In [None]:
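# Mouse annotation: read the GTF, then expand the attribute column into separate columns.
GTF = age.readGTF("gencode.vM32.chr_patch_hapl_scaff.annotation.gtf")
GTF = age.parseGTF(GTF)

# Human annotation: use the file name of the GENCODE release you actually downloaded.
human_gtf = age.readGTF("gencode.v46.chr_patch_hapl_scaff.annotation.gtf")
human_gtf = age.parseGTF(human_gtf)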

APOE_gtf = human_gtf[human_gtf.gene_name == "APOE"]
APOE_gtf = APOE_gtf[APOE_gtf.seqname == "chr19"].copy()  # copy so the rename does not touch human_gtf
APOE_gtf.loc[:, 'seqname'] = "chr20"  # new chromosome for human APOE

MAPT_gtf = human_gtf[human_gtf.gene_name == "MAPT"]
MAPT_gtf = MAPT_gtf[MAPT_gtf.seqname == "chr17"].copy()  # filter MAPT_gtf, not the full human_gtf
MAPT_gtf.loc[:, 'seqname'] = "chr21"  # new chromosome for human MAPT

mad1_gtf = pd.concat([GTF, APOE_gtf, MAPT_gtf], ignore_index=True)

age.writeGTF(mad1_gtf, "MAD1.gtf")
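
The filter, rename, and concatenate pattern above can be checked on a small example before running it on the full annotations. The sketch below uses only pandas and toy data frames with made-up coordinates; the helper `relabel_gene` and the toy column values are hypothetical and only stand in for the parsed GTF tables.

```python
import pandas as pd


def relabel_gene(gtf, gene_name, source_chrom, new_chrom):
    """Return the rows for one gene on one chromosome, moved to new_chrom."""
    mask = (gtf["gene_name"] == gene_name) & (gtf["seqname"] == source_chrom)
    rows = gtf[mask].copy()  # copy so the source table is left unchanged
    rows["seqname"] = new_chrom
    return rows


# Toy stand-ins for the parsed mouse and human annotations (not real coordinates).
mouse = pd.DataFrame({
    "seqname": ["chr1", "chr7"],
    "feature": ["gene", "gene"],
    "start": [100, 500],
    "end": [200, 900],
    "gene_name": ["Xkr4", "Apoe"],
})
human = pd.DataFrame({
    "seqname": ["chr19", "chr19", "chr17", "chr17"],
    "feature": ["gene", "transcript", "gene", "gene"],
    "start": [10, 10, 50, 300],
    "end": [40, 40, 90, 400],
    "gene_name": ["APOE", "APOE", "MAPT", "KANSL1"],
})

apoe = relabel_gene(human, "APOE", "chr19", "chr20")
mapt = relabel_gene(human, "MAPT", "chr17", "chr21")

# The new chromosome names must not already exist in the mouse annotation.
assert not {"chr20", "chr21"} & set(mouse["seqname"])

combined = pd.concat([mouse, apoe, mapt], ignore_index=True)
print(combined[["seqname", "feature", "gene_name"]])
```

Only the APOE and MAPT rows are carried over; other genes on chr17, such as the toy KANSL1 row, stay out of the combined table.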

## Create FASTA file

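You need to extract the sequence of the entire chromosome that carries the given gene from the human FASTA file, then rename that chromosome to match the name you used in the GTF file. Because the GTF records keep their original human coordinates, the whole chromosome is needed so that those coordinates still point at the gene.

Once you have the custom FASTA file(s), concatenate them with the mouse FASTA file.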

```
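# Copy the whole chr19 / chr17 record (header plus sequence lines).
# grep alone would return only the matching header lines.
awk '/^>/ {keep = ($1 == ">chr19")} keep' hg38.fa > APOE.fa
awk '/^>/ {keep = ($1 == ">chr17")} keep' hg38.fa > MAPT.fa

# Rename each header to the chromosome name used in the GTF.
sed -i 's/^>chr19/>chr20/' APOE.fa
sed -i 's/^>chr17/>chr21/' MAPT.fa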

cat mm39.fa APOE.fa MAPT.fa > MAD1.fa
```
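
The extraction and renaming step can also be sketched in plain Python, which makes the logic explicit: keep every line of the record whose header names the wanted chromosome, and rewrite that header. This is a minimal sketch on a toy in-memory FASTA; the function `extract_chromosome` and the toy sequences are hypothetical, and the contig name `chr19_example_alt` is made up to show that partial name matches are excluded.

```python
import io


def extract_chromosome(fasta_lines, source_chrom, new_chrom):
    """Yield the FASTA record named source_chrom with its header renamed to new_chrom."""
    keep = False
    for line in fasta_lines:
        if line.startswith(">"):
            # Compare the first word of the header exactly, so "chr19" does not
            # also match names such as "chr19_example_alt".
            fields = line[1:].split()
            keep = bool(fields) and fields[0] == source_chrom
            if keep:
                yield f">{new_chrom}\n"
        elif keep:
            yield line


# Toy FASTA with three records (not real sequence).
toy_fasta = io.StringIO(
    ">chr17\nACGT\nTTAA\n"
    ">chr19\nGGCC\nAATT\n"
    ">chr19_example_alt\nNNNN\n"
)

apoe_record = "".join(extract_chromosome(toy_fasta, "chr19", "chr20"))
print(apoe_record)
```

With real files, the same function can be given an open file handle for the human FASTA, and its output written line by line to `APOE.fa` or `MAPT.fa`.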