# Prerequisite Setup
First, make sure a TAP instance is running and that you know its address (IP or hostname).

Also make sure you have fetched the schema before running any queries.


In [1]:
!pip install 'tapclipy>=0.1.8'
from tapclipy import tap_connect
import json

# Create TAP Connection
tap = tap_connect.Connect('http://tap.hi2lab.io')
tap.fetch_schema()
print(tap.url())

http://tap.hi2lab.io/graphql
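
The examples in this notebook pass their parameters as a hand-written JSON string. Since `json` is already imported, one alternative is to build that string from a dictionary, which avoids quoting mistakes. This is only a sketch: `clean_params` is a hypothetical helper name, not part of tapclipy, and it assumes TAP parses the parameter string as ordinary JSON.

```python
import json

def clean_params(clean_type):
    """Return the JSON parameter string for a clean query (hypothetical helper)."""
    return json.dumps({"cleanType": clean_type})

params = clean_params("visible")
print(params)  # {"cleanType": "visible"}
```

The resulting string can be passed wherever the examples use a literal such as `'''{ "cleanType":"visible" }'''`.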


# Clean

Clean is a query that cleans and formats text according to the `cleanType` parameter you pass.
There are currently 5 values you can pass.

- visible = Replaces every space with a dot (·) and every new line with a visible marker (¬).
- minimal = Removes all extra whitespace and extra new lines, leaving only one of each.
- simple = Removes all extra whitespace and extra new lines, leaving only one of each. It also replaces hyphens and quotes with their ASCII-safe equivalents.
- preserve = Replaces tabs, non-breaking spaces and new lines with standard spaces and line feeds, preserving the length of the text.
- ascii = Removes all non-ASCII characters, i.e. any character with a code above 127.

See below for examples and descriptions.
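
To compare the five clean types on the same input, the calls used throughout this notebook (`tap.query` and `tap.analyse_text`) can be wrapped in a loop. The sketch below is one way to do that; `CLEAN_TYPES` and `clean_all` are hypothetical names introduced here for illustration, and the function assumes the response has the same `["data"]["clean"]["analytics"]` shape as in the examples.

```python
import json

# The five clean types listed above.
CLEAN_TYPES = ["visible", "minimal", "simple", "preserve", "ascii"]

def clean_all(tap, text, clean_types=CLEAN_TYPES):
    """Run `text` through each clean type and return {clean_type: cleaned_text}."""
    query = tap.query('clean')
    results = {}
    for clean_type in clean_types:
        params = json.dumps({"cleanType": clean_type})
        response = tap.analyse_text(query, text, params)
        results[clean_type] = response["data"]["clean"]["analytics"]
    return results
```

With a live connection, `clean_all(tap, "Some   text")` would return a dictionary keyed by clean type, which makes it easy to print the results side by side.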

## Visible
Replaces every space with a dot (·) and every new line with a visible marker (¬), so that whitespace shows up in the output.

#### Example:

In [8]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to visible
params = '''{ "cleanType":"visible" }'''

# pass in some test data
string = "This will replace spaces with dots and \n newlines with line feeds"

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Visible Clean:")
print("-" * 40)
print("Input Text: \n\n", string)
print("\n")
print("Result: \n\n", strResult["data"]["clean"]["analytics"])

----------------------------------------
Visible Clean:
----------------------------------------
Input Text: 

 This will replace spaces with dots and 
 newlines with line feeds


Result: 

 This·will·replace·spaces·with·dots·and·¬·newlines·with·line·feeds
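
For intuition, the substitution seen in the result above can be approximated locally with two string replacements. This is a sketch of the observed behaviour only, not TAP's implementation; `visible_sketch` is a hypothetical name, and TAP may treat tabs or other whitespace differently.

```python
def visible_sketch(text):
    """Local approximation: mark spaces with '·' and new lines with '¬'."""
    return text.replace(" ", "·").replace("\n", "¬")

sample = "This will replace spaces with dots and \n newlines with line feeds"
print(visible_sketch(sample))
# This·will·replace·spaces·with·dots·and·¬·newlines·with·line·feeds
```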


## Minimal
Removes all extra whitespace and extra new lines, leaving only one of each.

#### Example:

In [10]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to minimal
params = '''{ "cleanType":"minimal" }'''

# pass in some test data
string = "This will remove extra      spaces and \n \n \n extra new lines"

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Minimal Clean:")
print("-" * 40)
print("Input Text: \n\n", string)
print("\n")
print("Result: \n\n", strResult["data"]["clean"]["analytics"])

----------------------------------------
Minimal Clean:
----------------------------------------
Input Text: 

 This will remove extra      spaces and 
 
 
 extra new lines


Result: 

 This will remove extra spaces and
extra new lines
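
A rough local equivalent of the result above collapses runs of spaces inside each line, trims each line, and drops lines that end up empty. This is a sketch that reproduces the output shown for this input, not TAP's actual algorithm; `minimal_sketch` is a hypothetical name.

```python
import re

def minimal_sketch(text):
    """Collapse repeated spaces and tabs, trim each line, and drop empty lines."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n"))
    return "\n".join(line for line in lines if line)

sample = "This will remove extra      spaces and \n \n \n extra new lines"
print(minimal_sketch(sample))
```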


## Simple
Removes all extra whitespace and extra new lines, leaving only one of each.
It also replaces hyphens and quotes with their ASCII-safe equivalents.

#### Example:

In [15]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to simple
params = '''{ "cleanType":"simple" }'''

# pass in some test data
string = "This will remove extra      spaces and \n \n \n extra new lines and replace “ with \" "

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Simple Clean:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Result:\n\n", strResult["data"]["clean"]["analytics"])

----------------------------------------
Simple Clean:
----------------------------------------
Input Text:

 This will remove extra      spaces and 
 
 
 extra new lines and replace “ with " 


Result:

 This will remove extra spaces and
extra new lines and replace " with " 
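
The quote and hyphen substitution can be pictured as a character translation table. The sketch below uses `str.translate` for this; the particular set of characters in `ASCII_SAFE` is an assumption chosen for illustration, and TAP may map a different or larger set.

```python
# Hypothetical mapping of typographic characters to ASCII equivalents.
ASCII_SAFE = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})

def simple_substitutions(text):
    """Replace typographic quotes and dashes with ASCII characters."""
    return text.translate(ASCII_SAFE)

print(simple_substitutions("replace \u201c with \""))  # replace " with "
```

This covers only the character replacement; the whitespace handling is the same as for the minimal clean type.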


## Preserve
Replaces tabs, non-breaking spaces and new lines with standard spaces and line feeds, preserving the length of the text.

#### Example:

In [20]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to preserve
params = '''{ "cleanType":"preserve" }'''

# pass in some test data
string = "This will replace tabs, non breaking spaces and new lines \n \n with standard spaces and linefeeds."

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Preserve Clean:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Result:\n\n", strResult["data"]["clean"]["analytics"])

----------------------------------------
Preserve Clean:
----------------------------------------
Input Text:

 This will replace tabs, non breaking spaces and new lines 
 
 with standard spaces and linefeeds.


Result:

 This will replace tabs, non breaking spaces and new lines 
 
 with standard spaces and linefeeds.
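
Preserving length implies that every replacement is one character for one character. The sketch below illustrates that idea with an input that actually contains a tab and a non-breaking space; `PRESERVE_MAP` and `preserve_sketch` are hypothetical names, and the exact characters TAP replaces are an assumption based on the wording of the test string above.

```python
# One-for-one replacements, so the output has the same length as the input.
PRESERVE_MAP = str.maketrans({"\t": " ", "\u00a0": " "})

def preserve_sketch(text):
    """Replace tabs and non-breaking spaces with standard spaces."""
    return text.translate(PRESERVE_MAP)

sample = "tabs\tand\u00a0non-breaking spaces \n stay aligned"
cleaned = preserve_sketch(sample)
print(repr(cleaned))
print(len(sample) == len(cleaned))  # True
```

Keeping the length unchanged means character offsets computed on the cleaned text still point at the same positions in the original.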


## Ascii
Removes all non-ASCII characters, i.e. any character with a code above 127.

#### Example:

In [22]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to ascii
params = '''{ "cleanType":"ascii" }'''

# pass in some test data
string = "This ¡ will ¢ replace £ any ¤ non ascii characters"

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Ascii Clean:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Result:\n\n", strResult["data"]["clean"]["analytics"])

----------------------------------------
Ascii Clean:
----------------------------------------
Input Text:

 This ¡ will ¢ replace £ any ¤ non ascii characters


Result:

 This  will  replace  any  non ascii characters
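
The result above is consistent with simply dropping every character whose code point is 128 or higher, which is why double spaces remain where the symbols used to be. A local sketch of that filter follows; `ascii_sketch` is a hypothetical name and this is not TAP's implementation.

```python
def ascii_sketch(text):
    """Keep only characters in the 7-bit ASCII range (code points 0 to 127)."""
    return "".join(ch for ch in text if ord(ch) < 128)

sample = "This ¡ will ¢ replace £ any ¤ non ascii characters"
print(ascii_sketch(sample))
# This  will  replace  any  non ascii characters
```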
