## A survey of iodine transporters, for identification of candidates for recombination into a microbe.

Procedure:
- scrape human, mouse [Na-I Symporters](https://en.wikipedia.org/wiki/Sodium/iodide_cotransporter)
- verify that they have conserved sequences
- scrape for other human, mouse [solute carrier membrane transport proteins](https://en.wikipedia.org/wiki/Solute_carrier_family), to get a sense of which active domains might make the Na-I symporter specific for iodine.
- loop:
  - identify peptide sequences of interest.
  - search databases for other matching domains.
  - if results are satisfactory and the crucial domains are clear, exit loop.
  - else, expand the search to other species up the phylogenetic tree (moving in the direction of microbes and kelp) and look at other proteins across life that are specific to iodine, by doing literature research.
  - identify candidate species proteins, and find sequencing data.
- end with a protein of interest that is a likely candidate for being a pump that's viable for 
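
The "verify that they have conserved sequences" step can be sketched as a per-column identity scan over sequences that have already been aligned. This is only a minimal sketch under stated assumptions, not the method used later in this notebook: the function names (`column_identity`, `conserved_runs`), the 0.9 threshold, the minimum run length, and the toy alignment are all hypothetical, and the input is assumed to be pre-aligned to equal length (for example by a multiple sequence aligner).

```python
from collections import Counter

def column_identity(aligned):
    """Fraction of sequences sharing the most common residue in each column.

    Assumes every string in `aligned` has the same length; '-' marks a gap
    and never counts as a conserved residue.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    fractions = []
    for col in zip(*aligned):
        counts = Counter(c for c in col if c != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        fractions.append(top / len(aligned))
    return fractions

def conserved_runs(fractions, threshold=0.9, min_len=3):
    """Return (start, end) half-open column ranges where identity >= threshold."""
    runs, start = [], None
    for i, f in enumerate(fractions + [0.0]):  # sentinel closes a trailing run
        if f >= threshold and start is None:
            start = i
        elif f < threshold and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

# Toy alignment (hypothetical sequences, not real transporter data).
toy = [
    "MEAVETGERP",
    "MEAVQTGERP",
    "MEAV-TGERP",
]
print(conserved_runs(column_identity(toy)))  # [(0, 4), (5, 10)]
```

Runs that survive across distant species would be the natural "peptide sequences of interest" to carry into the loop above.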

General links:
- [Capnocytophaga canimorsus](https://en.wikipedia.org/wiki/Capnocytophaga_canimorsus)
- [Na-I Symporter](https://en.wikipedia.org/wiki/Sodium/iodide_cotransporter)
- [Solute Carrier Family](https://en.wikipedia.org/wiki/Solute_carrier_family)
- [Iodine in biology](https://en.wikipedia.org/wiki/Iodine_in_biology)
- [Iodine](https://en.wikipedia.org/wiki/Iodine#Biological_role)
- Nitrate from [Nitratine](https://en.wikipedia.org/wiki/Nitratine) "Nitratine ... has been known since 1845 from mineral deposits in the Confidence Hills, Southern Death Valley, California and the Atacama Desert, Chile. It is still used in organic farming (where Haber–Bosch ammonia is forbidden) in the US, but prohibited in international organic agriculture.[9]"
- Fertilizer constituents [(src)](https://en.wikipedia.org/wiki/Fertilizer): Fertilizers typically provide, in varying proportions:[35]
  - Three main macronutrients (NPK):
    - Nitrogen (N): leaf growth and stems;[36]
    - Phosphorus (P): development of roots, flowers, seeds and fruit;[37]
    - Potassium (K): strong stem growth, movement of water in plants, promotion of flowering and fruiting;[38]
  - Three secondary macronutrients: calcium (Ca), magnesium (Mg), and sulfur (S);[39]
  - Micronutrients: copper (Cu), iron (Fe), manganese (Mn), molybdenum (Mo), zinc (Zn), and boron (B).[40] Of occasional significance are silicon (Si), cobalt (Co), and vanadium (V)
- [Eutrophication](https://en.wikipedia.org/wiki/Eutrophication)
- Productive Species: Laminaria digitata, Laminaria hyperborea

In [40]:
# Run this code inside a Jupyter cell. If all three lines point to your .venv folder in WSL, your setup is perfect:
# In this project (iodine transporter survey), the venv should be found in ../comparative-sequence-analysis, 
# should we need to get it manually.

import sys, os
print(f"1. Executable: {sys.executable}") 
print(f"2. Version: {sys.version}")
print(f"3. Env Path: {os.getenv('VIRTUAL_ENV')}")

1. Executable: /home/morgan/projects/bioinfo-projects/comparative-sequence-analysis/.venv/bin/python
2. Version: 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0]
3. Env Path: /home/morgan/projects/bioinfo-projects/comparative-sequence-analysis/.venv


### SLC5A5 / NIS / TDH1 - solute carrier family 5 member 5 - gene id 6528

https://www.ncbi.nlm.nih.gov/datasets/gene/6528/
- RefSeq summary:     This gene encodes a member of the sodium glucose cotransporter family. The encoded protein is responsible for the uptake of iodine in tissues such as the thyroid and lactating breast tissue. The iodine taken up by the thyroid is incorporated into the metabolic regulators triiodothyronine (T3) and tetraiodothyronine (T4). Mutations in this gene are associated with thyroid dyshormonogenesis 1.[provided by RefSeq, Sep 2009]
- 8 proteins, 8 transcripts, downloaded as ncbi_dataset(1).zip https://www.ncbi.nlm.nih.gov/datasets/gene/6528/#transcripts-and-proteins

https://www.ncbi.nlm.nih.gov/gene?cmd=retrieve&dopt=default&rn=1&list_uids=6528

https://www.ncbi.nlm.nih.gov/protein/KAI4041341.1 (partial) sequence.gp
https://www.ncbi.nlm.nih.gov/protein/KAI2589675.1 (partial?) sequence(1).gp

https://www.uniprot.org/uniprot/Q92911
[This chart](https://www.uniprot.org/uniprotkb/Q92911/variant-viewer) is an excellent way to identify regions of interest, especially after matching it to conserved regions and/or checking whether a region appears in other, distant species. [This search](https://www.uniprot.org/uniprotkb?query=iodine+NOT+%28taxonomy_id%3A33208%29) and [this one](https://www.uniprot.org/uniprotkb?query=iodine+AND+transporter+NOT+%28taxonomy_id%3A33208%29) query UniProt for entries that are not metazoa && iodine (&& transporter).
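
The two UniProt search links can be generated rather than hand-encoded, which makes it easier to swap in other terms (for example "iodide" or "haloperoxidase"). A small sketch using only the standard library; the helper name `uniprot_search_url` is hypothetical, and it only builds the website URL in the same form as the links above (it does not query UniProt).

```python
from urllib.parse import urlencode

UNIPROT_SEARCH = "https://www.uniprot.org/uniprotkb"
METAZOA_TAXID = 33208  # taxonomy id excluded in the two search links above

def uniprot_search_url(*terms, exclude_taxid=METAZOA_TAXID):
    """Build a UniProtKB website search URL for `terms`, excluding one taxon."""
    query = " AND ".join(terms) + f" NOT (taxonomy_id:{exclude_taxid})"
    return f"{UNIPROT_SEARCH}?{urlencode({'query': query})}"

print(uniprot_search_url("iodine"))
print(uniprot_search_url("iodine", "transporter"))
```

With these arguments the two printed URLs reproduce the two search links above character for character.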

About medical: Preedy V (2009). Comprehensive Handbook of Iodine Nutritional, Biochemical, Pathological and Therapeutic Aspects. Burlington: Elsevier. p. 616. ISBN 978-0-08-092086-3. [link](https://books.google.com/books?id=7v7g5XoCQQwC&pg=PA616). [src wiki hypothyroidism](https://en.wikipedia.org/wiki/Hypothyroidism#Prevention)

In [None]:
from Bio import SeqIO

# Collect every SLC5A5 protein record: the NCBI Datasets download plus the
# two FASTA files that were saved separately.
fasta_paths = [
    'data/SLC5A5-h-sapiens/ncbi_dataset/data/protein.faa',
    'data/SLC5A5-h-sapiens/sequence.fasta',
    'data/SLC5A5-h-sapiens/sequence(1).fasta',
]
sequences = []
for path in fasta_paths:
    sequences.extend(SeqIO.parse(path, 'fasta'))
sequences  # display the loaded records


[SeqRecord(seq=Seq('MEAVETGERPTFGAWDYGVFALMLLVSTGIGLWVGLARGGQRSAEDFFTGGRRL...TNL'), id='NP_000444.1', name='NP_000444.1', description='NP_000444.1 SLC5A5 [organism=Homo sapiens] [GeneID=6528] [isoform=1]', dbxrefs=[]),
 SeqRecord(seq=Seq('MCLGQLLNSVLTALLFMPVFYRLGLTSTYEYLEMRFSRAVRLCGTLQYIVATML...TNL'), id='NP_001427636.1', name='NP_001427636.1', description='NP_001427636.1 SLC5A5 [organism=Homo sapiens] [GeneID=6528] [isoform=2]', dbxrefs=[]),
 SeqRecord(seq=Seq('MEAVETGERPTFGAWDYGVFALMLLVSTGIGLWVGLARGGQRSAEDFFTGGRRL...TNL'), id='XP_011526494.1', name='XP_011526494.1', description='XP_011526494.1 SLC5A5 [organism=Homo sapiens] [GeneID=6528] [isoform=X1]', dbxrefs=[]),
 SeqRecord(seq=Seq('MCLGQLLNSVLTALLFMPVFYRLGLTSTYEYLEMRFSRAVRLCGTLQYIVATML...TNL'), id='XP_011526495.1', name='XP_011526495.1', description='XP_011526495.1 SLC5A5 [organism=Homo sapiens] [GeneID=6528] [isoform=X2]', dbxrefs=[]),
 SeqRecord(seq=Seq('MRFSRAVRLCGTLQYIVATMLYTGIVIYAPALILNQVTGLDIWASLLSTGIICT...TNL'), id='XP_0115

In [111]:

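from Bio import Align
import numpy as np

# Upper-triangular matrix of pairwise scores; pairs that are never aligned stay at inf.
arr = np.full((len(sequences), len(sequences)), np.inf)

aligner = Align.PairwiseAligner(mode='local') # local is better than global or fogsaa in this task of long sequences.
all_aligns = {}
# Note: the outer range stops one short of the end, so the last sequence is never
# aligned against itself and the final row of arr stays inf.
for si1 in range(len(sequences)-1):
    for si2 in range(si1, len(sequences)):
        alignments = aligner.align(sequences[si1], sequences[si2])
        # print('='*80)
        # print(si1,si2)
        # print(alignments)
        # print(alignments[0])
        all_aligns[(si1,si2)] = alignments[0]  # keep only the first reported alignment
        arr[si1,si2] = alignments[0].score
arr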

array([[643., 554., 632., 543., 510., 632., 543., 510., 643., 643.],
       [ inf, 554., 543., 543., 510., 543., 543., 510., 554., 554.],
       [ inf,  inf, 654., 565., 532., 654., 565., 532., 632., 632.],
       [ inf,  inf,  inf, 565., 532., 565., 565., 532., 543., 543.],
       [ inf,  inf,  inf,  inf, 532., 532., 532., 532., 510., 510.],
       [ inf,  inf,  inf,  inf,  inf, 654., 565., 532., 632., 632.],
       [ inf,  inf,  inf,  inf,  inf,  inf, 565., 532., 543., 543.],
       [ inf,  inf,  inf,  inf,  inf,  inf,  inf, 532., 510., 510.],
       [ inf,  inf,  inf,  inf,  inf,  inf,  inf,  inf, 643., 643.],
       [ inf,  inf,  inf,  inf,  inf,  inf,  inf,  inf,  inf,  inf]])

In [112]:
print(arr-np.min(arr))

[[133.  44. 122.  33.   0. 122.  33.   0. 133. 133.]
 [ inf  44.  33.  33.   0.  33.  33.   0.  44.  44.]
 [ inf  inf 144.  55.  22. 144.  55.  22. 122. 122.]
 [ inf  inf  inf  55.  22.  55.  55.  22.  33.  33.]
 [ inf  inf  inf  inf  22.  22.  22.  22.   0.   0.]
 [ inf  inf  inf  inf  inf 144.  55.  22. 122. 122.]
 [ inf  inf  inf  inf  inf  inf  55.  22.  33.  33.]
 [ inf  inf  inf  inf  inf  inf  inf  22.   0.   0.]
 [ inf  inf  inf  inf  inf  inf  inf  inf 133. 133.]
 [ inf  inf  inf  inf  inf  inf  inf  inf  inf  inf]]


By this analysis, it seems small chunks are excised, which might even be an artifact of sequencing or annotation (the record descriptions label them as isoforms, e.g. 1, 2, X1, X2). There aren't small mutations that would show differences in function, which makes sense among individuals of the same species. Todo: look at mouse and other species data.
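
One way to make the "chunks excised" reading more concrete is to divide each pairwise score by the smaller of the two self-scores. This is a sketch under an assumption that is not confirmed anywhere in this notebook: that the aligner scored 1 per identical residue with no mismatch or gap penalty. The self-score of 643 for NP_000444.1 would fit that if the record is 643 residues long. Under that scoring the value behaves like a longest-common-subsequence length, so the ratio is an upper bound on shared identity rather than a true percent identity. The matrix is copied from the output above; the function name `shared_fraction` is hypothetical.

```python
import math

INF = math.inf

# Upper-triangular pairwise scores, copied from the notebook output above.
# Entry [i][j] (i <= j) is the local alignment score of sequence i vs j;
# INF marks pairs that were never aligned.
SCORES = [
    [643, 554, 632, 543, 510, 632, 543, 510, 643, 643],
    [INF, 554, 543, 543, 510, 543, 543, 510, 554, 554],
    [INF, INF, 654, 565, 532, 654, 565, 532, 632, 632],
    [INF, INF, INF, 565, 532, 565, 565, 532, 543, 543],
    [INF, INF, INF, INF, 532, 532, 532, 532, 510, 510],
    [INF, INF, INF, INF, INF, 654, 565, 532, 632, 632],
    [INF, INF, INF, INF, INF, INF, 565, 532, 543, 543],
    [INF, INF, INF, INF, INF, INF, INF, 532, 510, 510],
    [INF, INF, INF, INF, INF, INF, INF, INF, 643, 643],
    [INF, INF, INF, INF, INF, INF, INF, INF, INF, INF],
]

def shared_fraction(scores):
    """Map (i, j) -> score / min(self-score i, self-score j) for aligned pairs."""
    out = {}
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            values = (scores[i][j], scores[i][i], scores[j][j])
            if any(math.isinf(v) for v in values):
                continue  # a self-score or the pair score is missing
            out[(i, j)] = scores[i][j] / min(scores[i][i], scores[j][j])
    return out

fractions = shared_fraction(SCORES)
for (i, j), frac in sorted(fractions.items()):
    flag = "  <- shorter sequence fully covered" if frac == 1.0 else ""
    print(f"{i} vs {j}: {frac:.3f}{flag}")
```

Pairs at exactly 1.0 are consistent with one record being the other minus a block. Pairs involving sequence 9 are skipped because its self-score was never computed.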

## Reviewing possible Iodine transporters
Surveying the basic options for iodine transport seems like a good first step.
This list begins to compile the ways organisms transport iodine across membranes. 
For now, I'll exclude iodine metabolism processing, but that'll likely be important:
- A microbe that is engineered to intake a vast concentration of iodine will likely be poisoned by it
- Therefore, some means of sequestering the iodine is expected to be necessary.
- Two options that come to mind are:
  1) attaching it to free tyrosine and stacking that tyrosine either as an oligotyrosine peptide or free-floating iodo-tyrosines
  2) put the iodide ions directly into an inclusion body
  The first method suggests engineering metabolic enzymes. The second suggests vacuole transporters or working with organisms that isolate salt into deposits. These aren't the main focus, but I may as well include them at the bottom of the list. 

Main two categories of ways to get iodine into protists:
### Iodine transporters 
1) Na-I symporter, human mammal. SLC5A5
2) Na-Br symporter, human

### (vanadium-dependent halo)peroxidases.
1) vanadium-dependent iodoperoxidase (vIPO), Laminaria kelp.
2) (vHPO) vanadium-dependent haloperoxidase, diatom
3) vHPO, murex snail, sea urchin, horseradish, etc: see web-browsing-infodumps.md [tag514432] for more species

### Other proteins and mechanisms of interest:
1) MCT (monocarboxylate transporter) SLC16A2, human mammal. (This transports thyroid hormones, compounds built on iodinated tyrosine; thyroid hormones cannot cross plasma membranes on their own). (from [wiki](https://en.wikipedia.org/wiki/Monocarboxylate_transporter_8), see also zebrafish and frogs  Xenopus laevis and Xenopus tropicalis.)
2) iodine transferases
3) Pendrin, human mammal.
4) transporter of thyroxine-binding globulin, human mammal
5) transporter of transthyretin, thyroxine-binding albumin, human mammal. (Lower binding strength than thyroxine-binding globulin; [src](https://en.wikipedia.org/wiki/Transthyretin))
6) Thyroglobulin (Tg), storage protein of iodine and inactive thyroid hormone and catalyzes their bonding
   - thyroglobulin analogues in brown algae, protists?
   - see also: [Phenol Red](https://en.wikipedia.org/wiki/Phenol_red) for another potential 'load'. It also serves well as a development visual indicator: ie, say, with algae infused with iodine and phenol red, they are likely to bind in some capacity and you can visually determine if one or the other is adhering/uptaking. See also [Bromophenol blue](https://en.wikipedia.org/wiki/Bromophenol_blue) or other bromophenols.
   - about loads: [src](https://en.wikipedia.org/wiki/Iodine_in_biology#Non-animal_functions) "It is common across all domains of life and uses tyrosine bonded to iodine.[23] " and  "Plants, insects, zooplankton, and algae store iodine as mono-iodotyrosine (MIT), di-iodotyrosine (DIT), iodocarbons, or iodoproteins.[24][25][26] " [24-26]: see web-browsing-infodumps tag232119.
   
7) Deiodinase, selenium-dependent, human
8) proteins in the endostyles of lampreys, amphioxi, ascidia, tunicate, Lancelet. see <feb-10-workcomp>, & maybe [tag514432]
9) human peroxiredoxin genes: PRDX1, ... PRDX6.   
10) ai google for "halo transferase":  see below [tagHaloTag]
11) thyroid peroxidase. [src](https://en.wikipedia.org/wiki/Thyroid_hormones):
    - This iodide enters the follicular lumen from the cytoplasm by the transporter pendrin, in a purportedly passive manner.
    - In the colloid, iodide (I−) is oxidized to iodine (I0) by an enzyme called **thyroid peroxidase**.
    - Iodine (I0) is very reactive and iodinates the thyroglobulin at tyrosyl residues in its protein chain (in total containing approximately 120 tyrosyl residues). 
12) https://en.wikipedia.org/wiki/Recombinant_human_parathyroid_hormone
13) https://en.wikipedia.org/wiki/Thyroxine_5-deiodinase, esp @ releasing the iodine in response to stress
14) [Thyroid peroxidase](https://en.wikipedia.org/wiki/Thyroid_peroxidase), human: I- -> I0 for addition to tyrosine, encoded by TPO gene. aka thyroperoxidase (TPO), thyroid specific peroxidase or iodide peroxidase.
15) [Megalin](https://en.wikipedia.org/wiki/Megalin) aka Low density lipoprotein receptor-related protein 2 also known as LRP-2 , LRP-2 gene. "LRP2 is expressed in epithelial cells of the thyroid (thyrocytes), where it can serve as a receptor for the protein thyroglobulin (Tg)."  [src](https://en.wikipedia.org/wiki/Thyroid_hormones): "Iodinated thyroglobulin binds megalin for endocytosis back into the cell."
16) meta: filter proteins by their [protein targeting](https://en.wikipedia.org/wiki/Protein_targeting) sequences, to see where they go. ie, in algae, look at proteins that tend to find their way to near the membrane.
17) meta: find a thyroglobulin alternative by seeing what if anything in algae has lots of tyr sequences and no other apparent function.
18) leads from workcomp//data/survey-iodine-portuguese.pdf,  : "Sodium/iodide co-transporter (NIS), Anoctamin-1 (ANO1), Pendrin (PDS), and the Cystic fibrosis transmembrane conductance regulator (CFTR). ... Putative ortholog sequences were detected in plants [model species: Arabidopsis thaliana] for all these reconstructions: urea transporters DUR3 (NIS), sulphate transporters SULTR (Pendrin), anoctamin-like proteins (ANO1), ABC-C subfamily proteins (CFTR)". citation: Coimbra, Teresa Alexandra Vidal Pinheiro. "Bioinformatic strategies to explore iodine transport in plants and its potential application in biofortification." Master's thesis, Universidade do Minho (Portugal), 2022.
19) [Halide methyltransferases](https://en.wikipedia.org/wiki/Methyl_halide_transferase). In ~paper~ they know that plants use it to yeet iodine out of the organism.  Paper: workcomp//data/Metabolic-engineering-iodine-content-Arabidopsis.pdf
20) [Anoctamin-1](https://en.wikipedia.org/wiki/ANO1). "additionally Anoctamin-1 is apical iodide channel." Aka Transmembrane member 16A (TMEM16A). https://www.uniprot.org/uniprotkb/Q5XXA6/entry
21) [CFTR](https://en.wikipedia.org/wiki/Cystic_fibrosis_transmembrane_conductance_regulator) via transport of thiocyanate. ABC proteins: atp-binding cassette transporters. implicated in cystic fibrosis. ie, wiki/ABCC7. ABCC6 is also relevant, but 1-5 and 8-12(13) are not our route.  
    - See also: ABCD1, which transfers fatty acids into peroxisomes.
    - See also: Vasiliou, V; Vasiliou, K; Nebert, DW (April 2009). "Human ATP-binding cassette (ABC) transporter family". Human Genomics. 3 (3): 281–90. doi:10.1186/1479-7364-3-3-281. PMC 2752038. PMID 19403462.
22) [Tyrosinase](https://en.wikipedia.org/wiki/Tyrosinase), TYR gene.
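
Item 17 (finding a thyroglobulin alternative by looking for proteins with lots of tyrosine) lends itself to a quick first-pass screen. The following is a rough sketch, assuming a plain dict of name to protein sequence as input; the function names, the 20-residue window, and the `toy_tyr_rich` sequence are all hypothetical. The second toy entry is the first 20 residues of NP_000444.1 as printed earlier in this notebook.

```python
def tyrosine_fraction(seq):
    """Overall fraction of residues that are tyrosine (Y)."""
    seq = seq.upper()
    return seq.count("Y") / len(seq) if seq else 0.0

def max_window_tyrosine(seq, window=20):
    """Highest tyrosine count found in any stretch of `window` residues."""
    seq = seq.upper()
    if len(seq) <= window:
        return seq.count("Y")
    count = seq[:window].count("Y")
    best = count
    for i in range(window, len(seq)):
        # Slide the window one residue: add the new end, drop the old start.
        count += (seq[i] == "Y") - (seq[i - window] == "Y")
        best = max(best, count)
    return best

def rank_by_tyrosine(proteins, window=20):
    """Sort (name, fraction, best_window_count) rows, most tyrosine-rich first."""
    rows = [
        (name, tyrosine_fraction(seq), max_window_tyrosine(seq, window))
        for name, seq in proteins.items()
    ]
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)

toy = {
    "toy_tyr_rich": "MSYYGYSYKYYDYGSYYAYK",  # made-up sequence
    "nis_n_term": "MEAVETGERPTFGAWDYGVF",    # start of NP_000444.1
}
for name, frac, best in rank_by_tyrosine(toy):
    print(f"{name}: {frac:.2f} overall, up to {best} Y per 20 residues")
```

A real screen would still need the "no other apparent function" filter from item 17, which this sketch does not attempt.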


### Organisms
- Bacillus subtilis
- Homo sapiens
- Laminaria digitata
- Xenopus laevis [tag707153] 

### Bioinformatics analogue search fronts
- peroxidase (vanadium and otherwise)
- halotransporters
- thyroglobulin 
- dehalogenase 

-- ----

procedure 1: vHPO + cytoplasmic thyroglobulin/etc
1) use vanadium-dependent haloperoxidase to oxidize iodide (I-) into a neutral species so it passes into the organism, as atom or molecule
2) this reactive I0 species will tend to react with whatever's near, nonspecifically. Provide it thyroglobulin, or other tyrosine-rich peptide source, that can soak up reactive iodine, ideally near outer membrane.
- more native, but localization required (see screenshot s7642)
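
For reference, the chemistry that step 1 of procedure 1 leans on is the generally cited haloperoxidase reaction, written here for iodide. This is the textbook net reaction, not a claim about what any particular kelp enzyme does in vivo; whether the product that matters is HOI, I2, or something else is exactly the open question in these notes.

$$
\mathrm{I^- + H_2O_2 + H^+ \;\longrightarrow\; HOI + H_2O}
$$

$$
\mathrm{HOI + I^- + H^+ \;\rightleftharpoons\; I_2 + H_2O}
$$

Adding the two gives the overall stoichiometry:

$$
\mathrm{2\,I^- + H_2O_2 + 2\,H^+ \;\longrightarrow\; I_2 + 2\,H_2O}
$$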

procedure 2: Na/I symporter + TPO + thyroglobulin
1) use human Na/I symporter SLC5A5 (human hNIS) to transport iodide ions
2) oxidize the ions into I0 at the right location with TPO, possibly
3) let them react to tyrosine with thyroglobulin
- less native, but more directed

procedure 3: import iodine and add to biochemically polymerized tyrosine
1) import with hNIS, peroxidase, ABC.
2) in parallel, polymerize tyrosine/derivatives
2) in parallel, let iodine attach to tyrosine freely or catalyzed by modified tyrosinase
- fresh idea, maybe it won't stand the 'sleep on it' test.

extras:
1) eliminate any destination metabolic processes that utilize iodinated tyrosyls.
2) if need be, provide an internal vesicle to pump the I0 into via pendrin or Na/I symporter.
3) for extraction, consider looking into what Laminaria/etc uses to release the I in response to stress. perhaps the right antigens can cause it. 
4) read workcomp//data/Metabolic-engineering-iodine-content-Arabidopsis.pdf

-- ----


----

Tests:

- It seems uncertain whether the extracellular vIPO actually oxidizes I- into radicals for import or if it radicalizes the near extracellular neighborhood to harm pathogens. While there is evidence for vIPO mechanisms, kelp's high iodine content, and the theoretical protective potential of iodine, vIPO's increased expression in kelp in response to stress events doesn't clearly indicate that this is why iodine is stored in high concentrations. Perhaps kelp just keeps a massive store so that the normal efflux gradient can be radicalized with a single protein, turning a normal I- efflux into a protective I0 efflux?
- Question: is vIPO used to import iodine as nonpolar I2 or is it used to weaponize the organism's iodine ammo stock? 
- A test: add and subtract vIPO from choanoflagellates (which can survive concentrated iodine) and from Laminaria. Observe whether internally held iodine content changes.

Tags

### [tagHaloTag]

A halotransferase, often referred to in the context of HaloTag technology, is a modified haloalkane dehalogenase enzyme designed to covalently and irreversibly bind to specific chloroalkane-linked ligands. It serves as a versatile, self-labeling protein tag used for molecular imaging, protein purification, and immobilization, primarily enabling cellular, subcellular, and, in some cases, in-soil visualization of proteins. [1, 2, 3]  
Key Aspects of HaloTag/Halotransferase Technology: 

• Mechanism: The engineered enzyme (HaloTag protein) reacts rapidly under physiological conditions with synthetic ligands, forming a covalent bond with a chloroalkane linker. 
• Applications: It allows for tagging proteins with various fluorescent dyes, affinity handles (like biotin), or solid supports. 
• Utility: Used extensively in, for example, this study (https://pubs.acs.org/doi/10.1021/cb800025k) for imaging NF-κB cellular translocation and studying protein-protein or protein-DNA complexes. 
• Versatility: A single genetic fusion allows for different functional ligands to be applied; note this patent (https://patents.google.com/patent/EP3891226A2/en). 
• Distinction: It is distinct from metabolic enzymes that break down anesthetics, such as those discussed in this NIH study (https://pubmed.ncbi.nlm.nih.gov/3364763/). [1, 2, 4, 5]  

It should not be confused with HALO-tag which is often used interchangeably in molecular biology. 


- [1] https://pubs.acs.org/doi/10.1021/cb800025k
- [2] https://pmc.ncbi.nlm.nih.gov/articles/PMC3480824/
- [3] https://pubs.acs.org/doi/10.1021/acs.est.6b01415
- [4] https://pubmed.ncbi.nlm.nih.gov/3364763/
- [5] https://patents.google.com/patent/EP3891226A2/en

see also:

 https://pubs.acs.org/doi/10.1021/acs.est.6b01415
 Volatile Gas Production by Methyl Halide Transferase: An In Situ Reporter Of Microbial Gene Expression In Soil
 2016,  42 cites,    Hsiao-Ying Cheng, Caroline A. Masiello, George N. Bennett, Jonathan J. Silberg

### [tag707153] 

src: https://en.wikipedia.org/wiki/Iodine_in_biology#Other_functions
 The frog species Xenopus laevis has proven to be an ideal model organism for experimental study of the mechanisms of apoptosis and the role of iodine in developmental biology.[13][1][14][15]
 - 1:  Venturi, Sebastiano (2011). "Evolutionary Significance of Iodine". Current Chemical Biology. 5 (3): 155–162. doi:10.2174/187231311796765012 (inactive 18 July 2025). ISSN 1872-3136. 
 - 13: Jewhurst K, Levin M, McLaughlin KA (2014). "Optogenetic Control of Apoptosis in Targeted Tissues of Xenopus laevis Embryos". J Cell Death. 7: 25–31. doi:10.4137/JCD.S18368. PMC 4213186. PMID 25374461.
 - 14: Venturi, S.; Venturi, M. (2014). "Iodine, PUFAs and Iodolipids in Health and Disease: An Evolutionary Perspective". Human Evolution. 29 (1–3): 185–205. ISSN 0393-9375.
   - abs: The structural, metabolic and synergic actions of iodine and polyunsaturated fatty acids (PUFAs) in life evolution and in the ‘membrane lipid language’ of cells are reviewed. Iodine is one of the most electron-rich atoms in the diet of marine and terrestrial organisms and, as iodide (I-), acts as an ancestral electron-donor through peroxidase enzymes. It is the most primitive inorganic antioxidant in all iodide-concentrating cells, from primitive marine algae to more recent vertebrates. About 500 million years ago, the thyroid cells originated from the primitive gut of vertebrates, then migrated and specialized in the uptake and storage of iodocompounds in the thyroid, a new follicular organ. In parallel, ectodermic cells, differentiated into neuronal cells, became the primitive nervous system and brain. Both these cells synthesized iodolipids, as novel ‘words’ of the chemical ‘lipid language’ developed among cell membranes during the evolution of life, for better adaptation to terrestrial environments. The study of iodolipids is a new area of investigation, which might be useful for research on apoptosis, carcinogenesis and degenerative diseases, as well as for trying to understand some problems discussed regarding human evolution.
 - 15: Tamura K, Takayama S, Ishii T, Mawaribuchi S, Takamatsu N, Ito M (2015). "Apoptosis and differentiation of Xenopus tail-derived myoblasts by thyroid hormone". J Mol Endocrinol. 54 (3): 185–192. doi:10.1530/JME-14-0327. PMID 25791374.