# Application 1 -- Building an Interactive Dictionary

In [1]:
## Loading a JSON file into a Python dictionary
import json
help(json)

Help on package json:

NAME
    json

DESCRIPTION
    JSON (JavaScript Object Notation) <http://json.org> is a subset of
    JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
    interchange format.
    
    :mod:`json` exposes an API familiar to users of the standard library
    :mod:`marshal` and :mod:`pickle` modules.  It is derived from a
    version of the externally maintained simplejson library.
    
    Encoding basic Python object hierarchies::
    
        >>> import json
        >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
        '["foo", {"bar": ["baz", null, 1.0, 2]}]'
        >>> print(json.dumps("\"foo\bar"))
        "\"foo\bar"
        >>> print(json.dumps('\u1234'))
        "\u1234"
        >>> print(json.dumps('\\'))
        "\\"
        >>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
        {"a": 0, "b": 0, "c": 0}
        >>> from io import StringIO
        >>> io = StringIO()
        >>> json.dump(['streaming API'], io

In [2]:
data=json.load(open("data.json"))

In [3]:
type(data)

dict

In [4]:
print(data)  ### the file is too large to print in full

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)
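
Rather than printing the whole dictionary, it is usually enough to check its size and look at a few entries. The following is a minimal sketch, not part of the original notebook. `sample_json` is a made-up stand-in for `data.json`, assuming each word maps to a list of definitions as the `rain` entry does.

```python
import json
from itertools import islice

# Made-up stand-in for data.json: each word maps to a list of definitions.
sample_json = """
{"rain": ["To fall from the clouds in drops of water."],
 "snow": ["(made-up entry) Frozen precipitation."],
 "hail": ["(made-up entry) Pellets of ice."]}
"""
data = json.loads(sample_json)

entry_count = len(data)                   # how many words are defined
preview = dict(islice(data.items(), 2))   # only the first two entries

print(entry_count)
print(preview)
```

With the real file, the same two calls keep the output small regardless of how many entries the dictionary holds.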



In [5]:
data["rain"]

['Precipitation in the form of liquid water drops with diameters greater than 0.5 millimetres.',
 'To fall from the clouds in drops of water.']

## The Program -- Build Up

### Simple dictionary output

In [6]:
import json

data=json.load(open("data.json"));

def translate(word):
    return data[word]

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : rain
['Precipitation in the form of liquid water drops with diameters greater than 0.5 millimetres.', 'To fall from the clouds in drops of water.']


In [7]:
import json

data=json.load(open("data.json"));

def translate(word):
    return data[word]

w=input("Enter word to be searched : ")

print(translate(w))

## A word that is missing from the dictionary raises a KeyError

Enter word to be searched : dsf


KeyError: 'dsf'

### Consideration for non-existing words

In [8]:
import json

data=json.load(open("data.json"));

def translate(word):
    if word in data:
        return data[word]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : sff
The word doesn't exist. Please double check it.
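
The membership test above is one way to avoid the `KeyError`. As a hedged sketch of two common alternatives, the same behaviour can be written with `dict.get` or with `try`/`except`. The names `translate_get`, `translate_try` and `NOT_FOUND` are hypothetical, and `data` is a one-entry stand-in for the loaded file.

```python
NOT_FOUND = "The word doesn't exist. Please double check it."

# One-entry stand-in for the dictionary loaded from data.json.
data = {"rain": ["To fall from the clouds in drops of water."]}

def translate_get(word):
    # dict.get returns the default instead of raising KeyError
    return data.get(word, NOT_FOUND)

def translate_try(word):
    # Attempt the lookup and fall back to the message if the key is absent
    try:
        return data[word]
    except KeyError:
        return NOT_FOUND

print(translate_get("sff"))
print(translate_try("rain"))
```

All three forms behave the same here. The explicit `if word in data` version is kept in the rest of the notebook because more branches are added to it later.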


In [9]:
import json

data=json.load(open("data.json"));

def translate(word):
    if word in data:
        return data[word]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

## Mixed-case input is also reported as missing, because dictionary keys are case-sensitive.

Enter word to be searched : rAiN
The word doesn't exist. Please double check it.


### Making the program case-insensitive

In [10]:
"Rain".lower()  ##conversion to lower case

'rain'

In [11]:
import json

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : raIN
['Precipitation in the form of liquid water drops with diameters greater than 0.5 millimetres.', 'To fall from the clouds in drops of water.']
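
Lower-casing handles mixed case, but input with stray spaces around it would still miss. A minimal sketch of a slightly more forgiving normalisation step, assuming the keys in `data.json` are stored in lower case (as the `rain` lookup suggests). `normalize` is a hypothetical helper, not part of the original program.

```python
# One-entry stand-in for the dictionary loaded from data.json.
data = {"rain": ["To fall from the clouds in drops of water."]}

def normalize(word):
    # Remove surrounding whitespace, then convert to lower case
    return word.strip().lower()

print(normalize("  raIN ") in data)
```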


In [12]:
import json

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

## Typing mistakes should be handled too, e.g. with a "Did you mean ...?" suggestion like Google's

Enter word to be searched : rainn
The word doesn't exist. Please double check it.


### Calculating similarity ratio between two words

In [13]:
from difflib import SequenceMatcher  ##library for comparing text

In [14]:
SequenceMatcher(None,"rainn","rain")

<difflib.SequenceMatcher at 0x5982cb0>

In [15]:
SequenceMatcher(None,"rainn","rain").ratio()

0.8888888888888888

A ratio of about 0.89, on a scale from 0 to 1 where 1 means identical, indicates that the two strings are very similar.
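
To see where that number comes from: `ratio()` is documented as `2.0 * M / T`, where `M` is the number of matching elements and `T` is the total length of both sequences. The sketch below recomputes it by hand from `get_matching_blocks()`. The variable names are illustrative only.

```python
from difflib import SequenceMatcher

matcher = SequenceMatcher(None, "rainn", "rain")

# Sum the sizes of all matching blocks to get M
matched = sum(block.size for block in matcher.get_matching_blocks())

# T is the combined length of both strings
total = len("rainn") + len("rain")

manual_ratio = 2 * matched / total
print(matched, total, manual_ratio)
```

Here the shared block is `rain` (4 characters) and the combined length is 9, so the ratio is 8/9.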

### Finding best match of the word

In [16]:
from difflib import get_close_matches

In [17]:
help(get_close_matches)

Help on function get_close_matches in module difflib:

get_close_matches(word, possibilities, n=3, cutoff=0.6)
    Use SequenceMatcher to return list of the best "good enough" matches.
    
    word is a sequence for which close matches are desired (typically a
    string).
    
    possibilities is a list of sequences against which to match word
    (typically a list of strings).
    
    Optional arg n (default 3) is the maximum number of close matches to
    return.  n must be > 0.
    
    Optional arg cutoff (default 0.6) is a float in [0, 1].  Possibilities
    that don't score at least that similar to word are ignored.
    
    The best (no more than n) matches among the possibilities are returned
    in a list, sorted by similarity score, most similar first.
    
    >>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
    ['apple', 'ape']
    >>> import keyword as _keyword
    >>> get_close_matches("wheel", _keyword.kwlist)
    ['while']
    >>> get_close_matches

In [18]:
get_close_matches("rainn",["help","pyramid","rain"])

['rain']

In [19]:
data.keys()



In [20]:
get_close_matches("rainn",data.keys())

['rain', 'train', 'rainy']

In [21]:
get_close_matches("rainn",data.keys(),n=5)

['rain', 'train', 'rainy', 'grain', 'drain']

In [22]:
## Taking only the first (best) match
get_close_matches("rainn",data.keys())[0]

'rain'
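
To make the ranking visible, the similarity score of every candidate can be listed next to the word. This is a rough sketch using a small hand-picked word list in place of `data.keys()`. `scored` is a hypothetical helper, and it only approximates what `get_close_matches` does internally (it applies no cutoff and no limit on the number of results).

```python
from difflib import SequenceMatcher, get_close_matches

# Hand-picked stand-in for data.keys()
words = ["help", "pyramid", "rain", "train", "rainy", "grain", "drain"]

def scored(word, possibilities):
    # Pair each candidate with its similarity ratio, best first
    pairs = [(SequenceMatcher(None, candidate, word).ratio(), candidate)
             for candidate in possibilities]
    return sorted(pairs, reverse=True)

for score, candidate in scored("rainn", words):
    print(round(score, 3), candidate)

print(get_close_matches("rainn", words)[0])
```

Candidates scoring below the default cutoff of 0.6 (here `help` and `pyramid`) are the ones `get_close_matches` leaves out.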

### Making the program suggest a similar word

In [31]:
get_close_matches("cucco",data.keys())[0]

'cuckoo'

#### Using the default cutoff (0.6)

In [42]:
import json
from difflib import get_close_matches

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    elif len(get_close_matches(word,data.keys()))>0:
        return "Did you mean %s instead?" % get_close_matches(word,data.keys())[0]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : cucco
Did you mean cuckoo instead?


In [44]:
import json
from difflib import get_close_matches

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    elif len(get_close_matches(word,data.keys()))>0:
        return "Did you mean %s instead?" % get_close_matches(word,data.keys())[0]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : afaaaaaattaaaa
The word doesn't exist. Please double check it.
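
The cells above call `get_close_matches` without a `cutoff` argument, so the default of 0.6 applies. The sketch below shows how a stricter cutoff changes the result for the same input. The one-word `candidates` list is a stand-in for `data.keys()`.

```python
from difflib import SequenceMatcher, get_close_matches

candidates = ["cuckoo"]  # stand-in for data.keys()

# Similarity between the typo and the intended word
score = SequenceMatcher(None, "cuckoo", "cucco").ratio()
print(round(score, 3))

default_matches = get_close_matches("cucco", candidates)              # cutoff=0.6
strict_matches = get_close_matches("cucco", candidates, cutoff=0.8)   # stricter

print(default_matches)  # ['cuckoo']
print(strict_matches)   # []
```

A higher cutoff gives fewer false suggestions but also rejects looser typos such as this one, so the value is a trade-off to tune against the actual data.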


Next, when a suggestion is offered, return its definition if the user accepts it, and report that the word was not found if the user declines.

### Prompting the user to confirm the suggestion

In [45]:
import json
from difflib import get_close_matches

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    elif len(get_close_matches(word,data.keys()))>0:
        yn=input("Did you mean %s instead? Enter Y if yes, or N if no." % get_close_matches(word,data.keys())[0])
        if yn=="Y":
            return data[get_close_matches(word,data.keys())[0]]
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : rainn
Did you mean rain instead? Enter Y if yes, or N if no.Y
['Precipitation in the form of liquid water drops with diameters greater than 0.5 millimetres.', 'To fall from the clouds in drops of water.']


In [52]:
import json
from difflib import get_close_matches

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    elif len(get_close_matches(word,data.keys()))>0:
        yn=input("Did you mean %s instead? Enter Y if yes, or N if no : " % get_close_matches(word,data.keys())[0])
        if yn=="Y":
            return data[get_close_matches(word,data.keys())[0]]
        elif yn=="N":
            return "The word doesn't exist. Please double check it."
        else:
            return "We didn't understand your entry."
            
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

print(translate(w))

Enter word to be searched : rainn
Did you mean rain instead? Enter Y if yes, or N if no : N
The word doesn't exist. Please double check it.
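
The function above calls `get_close_matches` up to three times for the same word and only accepts an upper-case `Y` or `N`. One possible tidy-up, sketched under assumptions: the `data` and `ask` parameters and the `NOT_FOUND` constant are introduced here for illustration, so that the function can be tried without `data.json` or keyboard input.

```python
from difflib import get_close_matches

NOT_FOUND = "The word doesn't exist. Please double check it."

def translate(word, data, ask=input):
    word = word.lower()
    if word in data:
        return data[word]
    # Compute the suggestions once and reuse them
    matches = get_close_matches(word, data.keys())
    if not matches:
        return NOT_FOUND
    suggestion = matches[0]
    answer = ask("Did you mean %s instead? Enter Y if yes, or N if no : " % suggestion)
    answer = answer.strip().upper()  # accept "y", " Y ", "n", ...
    if answer == "Y":
        return data[suggestion]
    if answer == "N":
        return NOT_FOUND
    return "We didn't understand your entry."

# One-entry stand-in for the dictionary loaded from data.json
sample = {"rain": ["To fall from the clouds in drops of water."]}

# Simulate the user typing "y" at the prompt
print(translate("rainn", sample, ask=lambda prompt: "y"))
```

Passing the prompt function as a parameter is only a convenience for testing. Leaving `ask` at its default gives the same interactive behaviour as the cell above.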


### Creating more user-friendly output

In [57]:
import json
from difflib import get_close_matches

data=json.load(open("data.json"));

def translate(word):
    word=word.lower();
    if word in data:
        return data[word]
    elif len(get_close_matches(word,data.keys()))>0:
        yn=input("Did you mean %s instead? Enter Y if yes, or N if no : " % get_close_matches(word,data.keys())[0])
        if yn=="Y":
            return data[get_close_matches(word,data.keys())[0]]
        elif yn=="N":
            return "The word doesn't exist. Please double check it."
        else:
            return "We didn't understand your entry."
            
    else:
        return "The word doesn't exist. Please double check it."

w=input("Enter word to be searched : ")

output=translate(w)

for item in output:
    print(item)
    
### The loop printed the message one character at a time, because iterating over a string yields its characters. Since translate() can return either a list or a string, the next cell handles each type separately.

Enter word to be searched : rainn
Did you mean rain instead? Enter Y if yes, or N if no : N
T
h
e
 
w
o
r
d
 
d
o
e
s
n
'
t
 
e
x
i
s
t
.
 
P
l
e
a
s
e
 
d
o
u
b
l
e
 
c
h
e
c
k
 
i
t
.
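
The character-by-character output comes from how a `for` loop treats its argument: a list yields its elements, while a string yields single characters. A small sketch with made-up values:

```python
definitions = ["first definition", "second definition"]  # made-up list result
message = "No match"                                      # made-up string result

# Iterating over a list yields each element
from_list = [item for item in definitions]

# Iterating over a string yields each character
from_string = [item for item in message]

print(from_list)
print(from_string)
```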


In [61]:
import json
from difflib import get_close_matches

# Use a context manager so the file is closed after loading
with open("data.json") as f:
    data = json.load(f)

def translate(word):
    word = word.lower()
    if word in data:
        return data[word]
    # Compute the close matches once and reuse the result
    matches = get_close_matches(word, data.keys())
    if len(matches) > 0:
        yn = input("Did you mean %s instead? Enter Y if yes, or N if no : " % matches[0])
        if yn == "Y":
            return data[matches[0]]
        elif yn == "N":
            return "The word doesn't exist. Please double check it."
        else:
            return "We didn't understand your entry."
    else:
        return "The word doesn't exist. Please double check it."

w = input("Enter word to be searched : ")

output = translate(w)

if isinstance(output, list):  ### Definitions come back as a list: print one per line
    for item in output:
        print(item)
else:                         ### Messages come back as a string: print as-is
    print(output)

Enter word to be searched : rainnn
Did you mean rain instead? Enter Y if yes, or N if no : N
The word doesn't exist. Please double check it.


## A web page or GUI front end could make the system more interactive