# words and fish >< o>
I want to play with lots of fish names. There's a long list of fish names on Wikipedia:
https://en.wikipedia.org/wiki/List_of_common_fish_names
I could copy and paste and hand-edit the list, or just do something quick that mostly works.



In [2]:
# grab the wiki article
import urllib

fishtml = urllib.urlopen('https://en.wikipedia.org/wiki/List_of_common_fish_names').readlines()
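
The cell above is Python 2. For anyone following along in Python 3, here is a rough sketch of the same fetch. The helper names `decode_lines` and `fetch_lines` are made up for this example, and UTF-8 is an assumption about the page encoding.

```python
import io
import urllib.request

URL = "https://en.wikipedia.org/wiki/List_of_common_fish_names"

def decode_lines(fileobj, encoding="utf-8"):
    """Read a binary file-like object and return its lines as text.

    The encoding is an assumption; undecodable bytes are replaced.
    """
    return [raw.decode(encoding, errors="replace") for raw in fileobj.readlines()]

def fetch_lines(url=URL):
    """Download the page and return its HTML source as a list of text lines."""
    with urllib.request.urlopen(url) as response:
        return decode_lines(response)

# Offline check of the decoding step on a fake response body (no network needed):
fake = io.BytesIO(b'<li><a href="/wiki/Blobfish" title="Blobfish">Blobfish</a></li>\n')
print(decode_lines(fake))
```

Calling `fetch_lines()` would stand in for the `fishtml = ...` line. Some sites reject the default `urllib` user agent, in which case a `urllib.request.Request` with a `User-Agent` header may be needed.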

The article's HTML source has lines that look like this:

`<li><a href="/wiki/Blobfish" title="Blobfish">Blobfish</a></li>`

It's faster to just look for those lines than to learn an HTML parser module.

In [14]:
import re
fishmatch = re.compile('<li><a href="/wiki/.*" title=".*">(.*)</a')
fishies = []

for line in fishtml:
    m = fishmatch.search(line)
    if m:
        fishies.append(m.group(1).lower())
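
The pattern can be tried on the sample line by itself before running it over the whole page. This is a small Python 3 sketch, not the cell above: the helper name `fish_name` is invented here, and the pattern is a slightly tightened variant (`[^"]*` and a non-greedy `(.*?)`) so that a line holding more than one link would not be swallowed whole.

```python
import re

# Variant of the pattern above: attribute values stop at the next quote,
# and the captured link text stops at the first closing </a.
fishmatch = re.compile(r'<li><a href="/wiki/[^"]*" title="[^"]*">(.*?)</a')

def fish_name(line):
    """Return the lowercased link text of a matching list item, else None."""
    m = fishmatch.search(line)
    return m.group(1).lower() if m else None

sample = '<li><a href="/wiki/Blobfish" title="Blobfish">Blobfish</a></li>'
print(fish_name(sample))                    # blobfish
print(fish_name('<p>not a list item</p>'))  # None
```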

In [15]:
# check the first ten fishies
fishies[0:10]

['aeneus corydoras',
 'african glass catfish',
 'african lungfish',
 'aholehole',
 'airbreathing catfish',
 'airsac catfish',
 'alaska blackfish',
 'albacore',
 'alewife',
 'alfonsino']

In [16]:
# check the last ten
fishies[-10:]

['aquarium fish',
 'blind fish',
 'fish families',
 'fish on stamps',
 'ichthyology terms',
 'large fish',
 'threatened rays',
 'threatened sharks',
 'prehistoric fish',
 'fish common names']

In [6]:
len(fishies)

1230

Some non-fish got mixed in there. Oh well.

In [11]:
# load the dictionary
with open('/usr/share/dict/words') as f:
    words = set(w.strip().lower() for w in f)

In [10]:
"Aholehole"[1:]

'holehole'

In [18]:
# words which become fishes when first letter is deleted?
for word in words:
    _ord = word[1:]
    if _ord.lower() in fishies:
        print word, _ord.lower()

troughy roughy
tangler angler
aide ide
stench tench
dangler angler
cling ling
stang tang
agar gar
boarfish oarfish
tide ide
yeel eel
devolution evolution
pray ray
feel eel
ride ide
vayu ayu
peel eel
hide ide
keel eel
jangler angler
mangler angler
spike pike
gray ray
jeel eel
bridgehead ridgehead
kling ling
broach roach
side ide
seel eel
delver elver
wide ide
cangler angler
achar char
froe roe
aperch perch
froughy roughy
bray ray
kalewife alewife
fide ide
teel eel
helver elver
scarp carp
reel eel
ekoi koi
mide ide
stope tope
nide ide
dray ray
shaddock haddock
heel eel
weel eel
revolution evolution
fling ling
fray ray
rangler angler
bide ide
lide ide
wangler angler
shake hake
sling ling
tray ray
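
The same first-letter search can be sketched as a small function on toy data. This is Python 3, the name `first_letter_drops` is made up for the example, and the sample lists are just a few entries picked from the output above.

```python
def first_letter_drops(words, fishies):
    """Return sorted (word, fish) pairs where deleting the word's first letter gives a fish."""
    fish_set = set(f.lower() for f in fishies)  # set lookup instead of scanning a list
    pairs = set()
    for word in words:
        rest = word[1:].lower()
        if rest in fish_set:
            pairs.add((word, rest))
    return sorted(pairs)

sample_words = ["stench", "spike", "feel", "table"]
sample_fish = ["tench", "pike", "eel", "roach"]
for word, fish in first_letter_drops(sample_words, sample_fish):
    print(word, fish)
```

Putting the fish names in a set means each lookup no longer walks the whole list of names, which adds up when the word list has a few hundred thousand entries.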


In [20]:
word = "fishies"
i = 2
_ord = word[0:i] + word[i+1:]
_ord

'fihies'

In [21]:
# words which become fishes when any letter is deleted?
for word in words:
    for i in range(0,len(word)):
        _ord = word[0:i] + word[i+1:]
        if _ord.lower() in fishies:
            print word, _ord.lower()

bogan boga
chair char
catlap catla
rode roe
troughy roughy
tangler angler
inde ide
gair gar
aide ide
grunth grunt
bonce bone
lying ling
stench tench
dangler angler
shriner shiner
cling ling
doab dab
stang tang
gara gar
rome roe
gnar gar
raya ray
skater skate
ruddy rudd
ides ide
roed roe
roey roe
roer roe
tangs tang
agar gar
cichloid cichlid
milty milt
soler sole
soles sole
trench tench
discous discus
boarfish oarfish
boarfish barfish
boarfish boafish
koli koi
racy ray
solen sole
ruffed ruffe
gaur gar
javelina javelin
tide ide
fishling fishing
yeel eel
bases bass
bassa bass
socle sole
devolution evolution
trope tope
silversides silverside
bonze bone
rose roe
solea sole
shady shad
shade shad
pray ray
feel eel
borne bone
basso bass
spole sole
ride ide
vayu ayu
molka mola
tetrad tetra
boned bone
boney bone
boner bone
komi koi
dance dace
charr char
charr char
jacky jack
jacko jack
peel eel
mopla mola
kopi koi
toper tope
scant scat
sproat sprat
scamles scales
globy goby
hide ide
ruffle ruffe
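
"charr char" shows up twice above because deleting either `r` gives the same result. A sketch of one way around that, in Python 3, is to collect the deletions in a set first. The function names `deletions` and `one_letter_fish` are invented for this example.

```python
def deletions(word):
    """Return the set of strings made by deleting exactly one letter from word."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}

def one_letter_fish(words, fishies):
    """Return sorted (word, fish) pairs where deleting one letter of word gives a fish."""
    fish_set = set(f.lower() for f in fishies)
    pairs = set()
    for word in words:
        for shorter in deletions(word.lower()):
            if shorter in fish_set:
                pairs.add((word, shorter))
    return sorted(pairs)

print(sorted(deletions("charr")))   # ['carr', 'char', 'chrr', 'harr']
print(one_letter_fish(["charr", "boarfish"], ["char", "oarfish", "barfish", "boafish"]))
```

A word like "boarfish" still appears once per distinct fish it turns into, which seems like the right behavior here.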

In [12]:
# fishes which become words when first letter is deleted?
for fish in fishies:
    ish = fish[1:]
    if ish.lower() in words:
        print fish, ish.lower()

Aruana ruana
Bangus angus
Bass ass
Betta etta
Bleak leak
Blenny lenny
Boafish oafish
Boarfish oarfish
Bream ream
Brill rill
Brotula rotula
Buri uri
Chub hub
Cod od
Dab ab
Dace ace
Dory ory
Drum rum
Eel el
Flier lier
Flounder lounder
Gar ar
Grouper rouper
Grunt runt
Hake ake
Herring erring
Hoki oki
Ide de
Inanga nanga
Lagena agena
Ling ing
Mora ora
Nase ase
Opah pah
Oscar scar
Pike ike
Pollock ollock
Porgy orgy
Powen owen
Ray ay
Ricefish icefish
Salmon almon
Sauger auger
Scat cat
Scup cup
Shad had
Shark hark
Skate kate
Smelt melt
Snapper napper
Snook nook
Sole ole
Sprat prat
Stickleback tickleback
Swallower wallower
Sweeper weeper
Thornfish hornfish
Tope ope
Trout rout
Tuna una
Uaru aru
Wrasse rasse
Evolution volution
Bone one
Gill ill
Jaw aw
ganoine anoine
Pregnancy regnancy
Roe oe
Groundfish roundfish
salmon almon
tuna una
herring erring
cod od
pollock ollock


Too many proper names like Ollock and Almon show up because `/usr/share/dict/words` is meant for spell checking. Something like a Scrabble dictionary would work better. There's one here:

[`https://code.google.com/p/scrabblehelper/source/browse/trunk/ScrabbleHelper/src/dictionaries/`](https://code.google.com/p/scrabblehelper/source/browse/trunk/ScrabbleHelper/src/dictionaries/)
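
Another option, sketched here in Python 3, is to keep `/usr/share/dict/words` but skip capitalized entries before lowercasing. This assumes proper names are stored capitalized in the file, which may not hold for every system's word list. The function name `load_common_words` and the sample entries are made up for the example.

```python
import io

def load_common_words(fileobj):
    """Return lowercase words from a word list, skipping capitalized entries.

    Assumption: proper names start with an uppercase letter in the file.
    """
    words = set()
    for line in fileobj:
        w = line.strip()
        if w and w[0].islower():
            words.add(w.lower())
    return words

# Stand-in for open('/usr/share/dict/words'), so the sketch runs anywhere.
sample = io.StringIO("Almon\nalmond\nOllock\npollock\nrout\n")
print(sorted(load_common_words(sample)))   # ['almond', 'pollock', 'rout']
```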


In [7]:
# load the scrabble dictionary
with open('/Users/ranjit/sowpods.txt') as f:
    sowpods = set(w.strip().lower() for w in f)

In [8]:
# fishes which become words when first letter is deleted?
for fish in fishies:
    ish = fish[1:]
    if ish.lower() in sowpods:
        print fish, ish.lower()

Aruana ruana
Ayu yu
Barb arb
Bass ass
Bleak leak
Boafish oafish
Boarfish oarfish
Bream ream
Brill rill
Brotula rotula
Chub hub
Cobia obia
Cod od
Dab ab
Dace ace
Drum rum
Eel el
Flier lier
Flounder lounder
Gar ar
Grayling rayling
Grunt runt
Hake ake
Herring erring
Ide de
Koi oi
Mora ora
Opah pah
Oscar scar
Panga anga
Porgy orgy
Ray ay
Sauger auger
Scat cat
Scup cup
Shad had
Shark hark
Smelt melt
Snapper napper
Snook nook
Sole ole
Sprat prat
Swallower wallower
Sweeper weeper
Tope ope
Trout rout
Wrasse rasse
Evolution volution
Bone one
Fins ins
Gill ill
Jaw aw
Meristics eristics
Pregnancy regnancy
Roe oe
Spawning pawning
sharks harks
herring erring
cod od
rays ays


A lot of fish have common words in their names. Find them...

In [26]:
wordfish = []

for word in sowpods:
    for fish in fishies:
        if word.lower() in fish.lower():
            wordfish.append(fish)
 
# let's see the first hundred
", ".join(wordfish[:100])

'Fire goby, Fire bar danio, Firefish, Siamese fighting fish, Fierasfer, Genetically modified, Black scalyfin, Bluefin tuna, Bowfin, Coffinfish, Finback cat shark, Fingerfish, Flagfin, Longfin, Longfin dragonfish, Longfin escolar, Longfin smelt, Long-finned char, Long-finned pike, Redfin perch, Sailfin silverside, Spinyfin, Splitfin, Threadfin, Threadfin bream, Triplefin blenny, Yellow-and-black triplefin, Yellowfin croaker, Yellowfin cutthroat trout, Yellowfin grouper, Yellowfin pike, Yellowfin surgeonfish, Yellowfin tuna, Fins, dorsal fin, Fin and flipper locomotion, Filefish, Filter feeders, Diseases and parasites, Pumpkinseed, Bristlemouth, Bristlenose catfish, Mouthbrooder, Mouthbrooder, Genetically modified, Coffinfish, Gopher rockfish, Steelhead, Chain pickerel, Fish on stamps, Death Valley pupfish, Desert pupfish, Owens pupfish, Pupfish, Blue-redstripe danio, Orangestriped triggerfish, Striped bass, Striped burrfish, False moray, Mora, Moray eel, Remora, Yellow-edged moray, Yell
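
The flat list above loses track of which word matched which fish. One way to keep that, sketched in Python 3 on toy data, is to group the fish under each word. The name `fish_by_word` and the sample lists are invented for this example.

```python
from collections import defaultdict

def fish_by_word(words, fishies):
    """Map each word to the list of fish names that contain it as a substring."""
    groups = defaultdict(list)
    for word in words:
        for fish in fishies:
            if word.lower() in fish.lower():
                groups[word].append(fish)
    return dict(groups)  # words with no matching fish are simply absent

sample_words = ["fin", "mouth", "zebra"]
sample_fish = ["Bluefin tuna", "Bowfin", "Bristlemouth", "Pupfish"]
for word, matches in sorted(fish_by_word(sample_words, sample_fish).items()):
    print(word + ":", ", ".join(matches))
```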

This might be more fun if organized better:

In [27]:
longwords = [word for word in sowpods if len(word)>5 and not word in fishies]


for fish in fishies:
    fishwords = []
    for word in longwords:
        if word.lower() in fish.lower():
            fishwords.append(word)
    print fish+": ", ", ".join(fishwords)

Aeneus corydoras:  aeneus
African glass catfish:  catfish
African lungfish:  lungfish
Aholehole:  
Airbreathing catfish:  catfish, breathing, breath
Airsac catfish:  catfish
Alaska blackfish:  blackfish, alaska
Albacore:  albacore
Alewife:  alewife
Alfonsino:  
Algae eater:  
Alligatorfish:  alligator
Alligator gar:  alligator
American sole:  
Amur pike:  
Anchovy:  
Anemonefish:  anemone
Angelfish:  elfish, angelfish
Angler:  angler
Angler catfish:  catfish, angler
Anglerfish:  angler, anglerfish
Antarctic cod:  arctic, antarctic
Antarctic icefish:  arctic, antarctic
Antenna codlet:  antenna
Arapaima:  arapaima
Archerfish:  archerfish, archer
Arctic char:  arctic
Armored gurnard:  armored, gurnard
Armored searobin:  searobin, armored
Armorhead:  
Armorhead catfish:  catfish
Armoured catfish:  catfish, armour, armoured
Arowana:  
Arrowtooth eel:  
Aruana:  
Asian carps:  
Asiatic glassfish:  
Atka mackerel:  
Atlantic cod:  
Atlantic herring:  erring
Atlantic salmon:  
Atlantic saury: 

Oops: the fish names were capitalized, so `word in fishies` never matched and the actual fish weren't filtered out of the dictionary. Try again!

In [29]:
lowercasefishies = set(word.lower() for word in fishies)

longwords = [word for word in sowpods if len(word) > 5 and word.lower() not in lowercasefishies]

for fish in fishies:
    fishwords = []
    for word in longwords:
        if word.lower() in fish.lower():
            fishwords.append(word)
    print fish+": ", ", ".join(fishwords)

Aeneus corydoras:  aeneus
African glass catfish:  
African lungfish:  
Aholehole:  
Airbreathing catfish:  breathing, breath
Airsac catfish:  
Alaska blackfish:  alaska
Albacore:  
Alewife:  
Alfonsino:  
Algae eater:  
Alligatorfish:  alligator
Alligator gar:  alligator
American sole:  
Amur pike:  
Anchovy:  
Anemonefish:  anemone
Angelfish:  elfish
Angler:  
Angler catfish:  
Anglerfish:  
Antarctic cod:  arctic, antarctic
Antarctic icefish:  arctic, antarctic
Antenna codlet:  antenna
Arapaima:  
Archerfish:  archer
Arctic char:  arctic
Armored gurnard:  armored
Armored searobin:  armored
Armorhead:  
Armorhead catfish:  
Armoured catfish:  armour, armoured
Arowana:  
Arrowtooth eel:  
Aruana:  
Asian carps:  
Asiatic glassfish:  
Atka mackerel:  
Atlantic cod:  
Atlantic herring:  erring
Atlantic salmon:  
Atlantic saury:  
Atlantic silverside:  silver, silvers
Australasian salmon:  austral
Australian grayling:  austral, rayling
Australian herring:  austral, erring
Australian lungf

KeyboardInterrupt: 

Oops again: fish with no matching words shouldn't be shown.

In [30]:
for fish in fishies:
    fishwords = []
    for word in longwords:
        if word.lower() in fish.lower():
            fishwords.append(word)
    if len(fishwords) > 0:
        print fish+": ", ", ".join(fishwords)

Aeneus corydoras:  aeneus
Airbreathing catfish:  breathing, breath
Alaska blackfish:  alaska
Alligatorfish:  alligator
Alligator gar:  alligator
Anemonefish:  anemone
Angelfish:  elfish
Antarctic cod:  arctic, antarctic
Antarctic icefish:  arctic, antarctic
Antenna codlet:  antenna
Archerfish:  archer
Arctic char:  arctic
Armored gurnard:  armored
Armored searobin:  armored
Armoured catfish:  armour, armoured
Atlantic herring:  erring
Atlantic silverside:  silver, silvers
Australasian salmon:  austral
Australian grayling:  austral, rayling
Australian herring:  austral, erring
Australian lungfish:  austral
Australian prowfish:  austral
Ballan wrasse:  ballan
Bamboo shark:  bamboo
Banded killifish:  banded
Barbeled dragonfish:  dragon
Barbeled houndshark:  hounds
Barred danio:  barred
Barreleye:  barrel
Basking shark:  asking, basking
Beaked salmon:  beaked
Beaked sandfish:  beaked
Beluga sturgeon:  beluga
Bicolor goat fish:  bicolor
Bigeye squaretail:  retail, square
Bighead carp:  bigh
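
Pulling the last few cells together, the whole search might look something like this in Python 3. It is only a sketch on toy data: the function name `words_in_fish` and the `min_len` parameter are made up here, with `min_len=6` standing in for the `len(word)>5` test.

```python
def words_in_fish(words, fishies, min_len=6):
    """Map each fish name to the long dictionary words hiding inside it.

    Words that are themselves fish names are left out, and fish with no
    matching words are skipped.
    """
    fish_set = set(f.lower() for f in fishies)
    long_words = [w.lower() for w in words
                  if len(w) >= min_len and w.lower() not in fish_set]
    result = {}
    for fish in fishies:
        name = fish.lower()
        found = [w for w in long_words if w in name]
        if found:
            result[fish] = found
    return result

sample_words = ["arctic", "antarctic", "catfish", "cod", "alligator"]
sample_fish = ["Antarctic cod", "Catfish", "Alligator gar", "Aholehole"]
for fish, found in words_in_fish(sample_words, sample_fish).items():
    print(fish + ": ", ", ".join(found))
```

Lowercasing both sides inside the function means it doesn't matter whether the fish list was capitalized when it was built.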