### TAP Queries Available as of 02.01.19

This is a brief look at the queries that are currently available and how to use them.


## Setup
First, let's get set up.


In [19]:
!pip install 'tapclipy>=0.1.8'
from tapclipy import tap_connect
import json

# Create TAP Connection
tap = tap_connect.Connect('http://192.168.99.102:9000')
tap.fetch_schema()
print(tap.url())

http://192.168.99.102:9000/graphql


## Clean

Clean is a query that cleans and formats the text according to the clean type you pass.
There are currently five clean types, passed as the `cleanType` parameter:

- visible = Replaces each whitespace character with a dot and each newline with a visible line-feed marker (`¬` in the example below).
- minimal = Removes all extra whitespace and extra newlines, leaving only one of each.
- simple = Removes all extra whitespace and extra newlines, leaving only one of each. It also replaces hyphens and quotes with their ASCII-safe equivalents.
- preserve = Replaces spaces with dots and preserves the length of the text.
- ascii = Replaces all non-ASCII characters, i.e. any character with a code above 127.
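
The clean type travels as a small JSON string. A minimal sketch of building that string with `json.dumps` rather than writing it by hand; `build_clean_params` and `CLEAN_TYPES` are hypothetical names for illustration and are not part of tapclipy.

```python
import json

# The five clean types listed above.
CLEAN_TYPES = ("visible", "minimal", "simple", "preserve", "ascii")

def build_clean_params(clean_type):
    """Return the JSON parameter string for a clean query (hypothetical helper)."""
    if clean_type not in CLEAN_TYPES:
        raise ValueError(f"unknown cleanType: {clean_type!r}")
    return json.dumps({"cleanType": clean_type})

params = build_clean_params("minimal")
print(params)
```

The resulting string can be passed as the third argument to `tap.analyse_text`, in the same way as the hand-written `params` in the example below.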

### Example:

In [20]:
# Set our query type to clean
query = tap.query('clean')

# Set our parameter to visible
params = '''{ "cleanType":"visible" }'''

# pass in some test data
string = "This will replace spaces with dots and \n newlines with line feeds"

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Visible Clean:")
print("-" * 40)
print("Input Text: \n\n", string)
print("\n")
print("Result: \n\n", strResult["data"]["clean"]["analytics"])
print("\n")
print("-" * 40)
print("RAW JSON RESULT")
print("-" * 40)
print(json.dumps(strResult, indent=2))


----------------------------------------
Visible Clean:
----------------------------------------
Input Text: 

 This will replace spaces with dots and 
 newlines with line feeds


Result: 

 This·will·replace·spaces·with·dots·and·¬·newlines·with·line·feeds


----------------------------------------
RAW JSON RESULT
----------------------------------------
{
  "data": {
    "clean": {
      "analytics": "This\u00b7will\u00b7replace\u00b7spaces\u00b7with\u00b7dots\u00b7and\u00b7\u00ac\u00b7newlines\u00b7with\u00b7line\u00b7feeds",
      "querytime": 97,
      "message": "",
      "timestamp": "2019-01-03T00:37:25.011056Z"
    }
  }
}
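
Every response in this notebook nests its payload under `data`, then the query name, then `analytics`. A small sketch of pulling that value out, using a trimmed copy of the response above; `get_analytics` is a hypothetical helper, not a tapclipy function.

```python
# Trimmed copy of the clean response shown above.
result = {
    "data": {
        "clean": {
            "analytics": "This\u00b7will\u00b7replace",
            "querytime": 97,
            "message": "",
        }
    }
}

def get_analytics(result, query_name):
    """Return the analytics payload for query_name, or None if it is absent."""
    return result.get("data", {}).get(query_name, {}).get("analytics")

cleaned = get_analytics(result, "clean")
# Turn the visible dots back into ordinary spaces for display.
print(cleaned.replace("\u00b7", " "))
```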


## Annotations

Annotations is a query that splits the text up into JSON data, separating the sentences into their own array and providing various stats on each word.

The stats provided for each word:

- lemma = the base (dictionary) form of the word, derived from its inflected form. You can find out more about lemmatisation [here](https://en.wikipedia.org/wiki/Lemmatisation)
- parent = the index of the word this word is dependent on
- pos tag = the part of speech tag for this word, learn more [here](https://nlp.stanford.edu/software/tagger.shtml)
- child = the indices of the words that are dependent on this word
- dep type = the dependency type, learn more [here](https://nlp.stanford.edu/software/dependencies_manual.pdf)
- ner tag = the named entity recognized, if any, learn more [here](https://nlp.stanford.edu/software/CRF-NER.shtml)

This query can provide different outcomes based on the pipeline type passed.

Possible pipelines:

- clu = returns the lemma, pos tag and ner tag
- standard = returns the lemma, pos tag, parent, children and dep type
- fast = returns the lemma and pos tag
- ner = returns the lemma, pos tag, parent, children, dep type and ner tag.
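
The list above can be turned into a lookup for picking a pipeline that returns the fields you need. This is only a sketch: the table is transcribed by hand from the list, the key names (`postag`, `nertag`, `deptype`) follow the example output in this section, and `pipelines_for` is a hypothetical helper.

```python
# Fields each pipeline returns, transcribed from the list above.
PIPELINE_FIELDS = {
    "clu": {"lemma", "postag", "nertag"},
    "standard": {"lemma", "postag", "parent", "children", "deptype"},
    "fast": {"lemma", "postag"},
    "ner": {"lemma", "postag", "parent", "children", "deptype", "nertag"},
}

def pipelines_for(required):
    """Return pipelines providing every required field, fewest fields first."""
    required = set(required)
    matches = [name for name, fields in PIPELINE_FIELDS.items() if required <= fields]
    return sorted(matches, key=lambda name: (len(PIPELINE_FIELDS[name]), name))

print(pipelines_for({"lemma"}))
print(pipelines_for({"nertag"}))
print(pipelines_for({"deptype", "nertag"}))
```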

### Example:

In [44]:
# Set our query type to annotations
query = tap.query('annotations')

# Set our pipeline parameter to clu, standard, fast or ner
params = '''{ "pipeType":"ner" }'''

# pass in some test data
string = "This is the first sentence. This is the second sentence."

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Annotations:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Annotations:
----------------------------------------
Input Text:

 This is the first sentence. This is the second sentence.


Raw Result:


{
  "data": {
    "annotations": {
      "analytics": [
        {
          "idx": 0,
          "start": 0,
          "end": 6,
          "length": 6,
          "tokens": [
            {
              "idx": 0,
              "term": "This",
              "lemma": "this",
              "postag": "DT",
              "parent": 1,
              "children": [],
              "deptype": "nsubj",
              "nertag": "O"
            },
            {
              "idx": 1,
              "term": "is",
              "lemma": "be",
              "postag": "VBZ",
              "parent": -1,
              "children": [
                0,
                4,
                5
              ],
              "deptype": "root",
              "nertag": "O"
            },
            {
              "idx": 2,
             

## Expressions

Expressions is a query that will extract the epistemic expressions of a sentence and list each sentence in its own array.

### Example:

In [43]:
# Set our query type to expressions
query = tap.query('expressions')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Expressions:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Expressions:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "expressions": {
      "analytics": [
        {
          "sentIdx": 0,
          "affect": [
            {
              "text": "first"
            },
            {
              "text": "great"
            },
            {
              "text": "happy"
            },
            {
              "text": "blue"
            },
            {
              "text": "angry"
            }
          ],
          "epistemic": [],
          "modal": []
        },
        {
          "sentIdx": 1,
          "affect": [
            {
              "text": "fantastic"
            }
          ],
          "epistemic": [],
          "modal": []
        }
      ]
    }
  }
}
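
To work with this result, it can help to flatten it into one list of texts per sentence. A sketch using a trimmed copy of the response above; `expressions_by_sentence` is a hypothetical helper, and it assumes `epistemic` and `modal` entries carry a `text` field like the `affect` entries do, which this run does not show because both lists are empty.

```python
# Trimmed copy of the expressions response shown above.
result = {"data": {"expressions": {"analytics": [
    {"sentIdx": 0, "affect": [{"text": "first"}, {"text": "great"}],
     "epistemic": [], "modal": []},
    {"sentIdx": 1, "affect": [{"text": "fantastic"}],
     "epistemic": [], "modal": []},
]}}}

def expressions_by_sentence(result, kind):
    """Map sentence index to the texts found for one expression kind."""
    sentences = result["data"]["expressions"]["analytics"]
    return {s["sentIdx"]: [e["text"] for e in s[kind]] for s in sentences}

affect = expressions_by_sentence(result, "affect")
print(affect)
```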


## Syllables

Syllables is a query that will return the syllable count for each word in a sentence and group each sentence into its own array.

### Example:

In [42]:
# Set our query type to syllables
query = tap.query('syllables')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Syllables:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Syllables:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "syllables": {
      "analytics": [
        {
          "sentIdx": 0,
          "avgSyllables": 1.1818181818181819,
          "counts": [
            1,
            1,
            1,
            1,
            1,
            2,
            1,
            2,
            1,
            2
          ]
        },
        {
          "sentIdx": 1,
          "avgSyllables": 1.4285714285714286,
          "counts": [
            1,
            1,
            1,
            2,
            3,
            2
          ]
        }
      ],
      "timestamp": "2019-01-03T01:47:47.632905Z"
    }
  }
}
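
The per-word counts can be summarised locally. A short sketch over the `counts` arrays copied from the response above; it only totals the numbers returned and makes no assumption about how `avgSyllables` is computed.

```python
# Syllable counts copied from the response above.
analytics = [
    {"sentIdx": 0, "counts": [1, 1, 1, 1, 1, 2, 1, 2, 1, 2]},
    {"sentIdx": 1, "counts": [1, 1, 1, 2, 3, 2]},
]

for sentence in analytics:
    counts = sentence["counts"]
    print(
        f"sentence {sentence['sentIdx']}: "
        f"{sum(counts)} syllables, longest word has {max(counts)}"
    )

# Total syllables per sentence, in sentence order.
totals = [sum(s["counts"]) for s in analytics]
```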


## Spelling

Spelling is a query that will return the spelling mistakes and possible suggestions for what the intended word was.

### Example:

In [52]:
# Set our query type to Spelling
query = tap.query('spelling')

# no params needed

# pass in some test data
string = "Th is the frst graet hpapy blue anrgy cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Spelling:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Spelling:
----------------------------------------
Input Text:

 Th is the frst graet hpapy blue anrgy cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "spelling": {
      "timestamp": "2019-01-03T01:56:44.311809Z",
      "message": "",
      "querytime": 13,
      "analytics": [
        {
          "sentIdx": 0,
          "spelling": []
        },
        {
          "sentIdx": 1,
          "spelling": []
        }
      ]
    }
  }
}
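
Note that this run reported no entries for either sentence, even though the first sentence contains misspellings, so the shape of a non-empty `spelling` entry is not shown here. The sketch below therefore only counts entries per sentence; `flagged_counts` is a hypothetical helper.

```python
# Copy of the analytics list from the response above.
analytics = [
    {"sentIdx": 0, "spelling": []},
    {"sentIdx": 1, "spelling": []},
]

def flagged_counts(analytics):
    """Map sentence index to the number of spelling entries reported."""
    return {s["sentIdx"]: len(s["spelling"]) for s in analytics}

counts = flagged_counts(analytics)
# Sentences for which the service reported nothing.
clean_sentences = [idx for idx, n in counts.items() if n == 0]
print(counts, clean_sentences)
```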


## Vocabulary

Vocabulary is a query that returns stats on the vocabulary used. It groups the text by unique words and reports how many times each was used.
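
The `terms` list is easy to reshape into a lookup. A sketch using the first few terms from the example output below; note that punctuation such as `.` is counted as a term there, so the sketch filters it out before listing repeated words.

```python
# First few terms copied from the example output in this section.
analytics = {
    "unique": 13,
    "terms": [
        {"term": "this", "count": 2},
        {"term": "is", "count": 2},
        {"term": ".", "count": 2},
        {"term": "fantastic", "count": 1},
    ],
}

# term -> count lookup
term_counts = {t["term"]: t["count"] for t in analytics["terms"]}

# Drop terms that contain no letters or digits (punctuation only).
words_only = {
    term: n for term, n in term_counts.items()
    if any(ch.isalnum() for ch in term)
}

repeated = sorted(term for term, n in words_only.items() if n > 1)
print(repeated)
```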

### Example:

In [55]:
# Set our query type to Vocabulary
query = tap.query('vocabulary')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Vocabulary:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Vocabulary:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "vocabulary": {
      "analytics": {
        "unique": 13,
        "terms": [
          {
            "term": "this",
            "count": 2
          },
          {
            "term": "is",
            "count": 2
          },
          {
            "term": ".",
            "count": 2
          },
          {
            "term": "fantastic",
            "count": 1
          },
          {
            "term": "blue",
            "count": 1
          },
          {
            "term": "angry",
            "count": 1
          },
          {
            "term": "second",
            "count": 1
          },
          {
            "term": "cold",
            "count": 1
          },
          {
            "term": "happy",
            "count": 1
          

## Metrics

Metrics is a query that returns statistics on the text that was parsed, such as:

- word count
- sentence count
- average word count per sentence
- an array of word counts, one per sentence


### Example:

In [56]:
# Set our query type to Metrics
query = tap.query('metrics')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Metrics:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Metrics:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "metrics": {
      "analytics": {
        "words": 16,
        "sentences": 2,
        "sentWordCounts": [
          10,
          6
        ],
        "averageSentWordCount": 8
      },
      "querytime": 13,
      "message": "",
      "timestamp": "2019-01-03T02:33:33.229210Z"
    }
  }
}
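
The fields in this response are consistent with each other, which can be checked locally. A quick sketch over the `analytics` object copied from the output above:

```python
# The analytics object copied from the response above.
analytics = {
    "words": 16,
    "sentences": 2,
    "sentWordCounts": [10, 6],
    "averageSentWordCount": 8,
}

counts = analytics["sentWordCounts"]
total_words = sum(counts)
average = total_words / len(counts)

print(total_words == analytics["words"])             # per-sentence counts add up
print(len(counts) == analytics["sentences"])         # one count per sentence
print(average == analytics["averageSentWordCount"])  # mean of the counts
```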


## Pos Stats (Part Of Speech)

Part of speech stats is a query that returns the verb, noun and adjective distributions across the sentences, together with ratios such as verb to noun and adjective to word.


### Example:

In [57]:
# Set our query type to posStats
query = tap.query('posStats')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Pos Stats:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Pos Stats:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "posStats": {
      "analytics": {
        "verbNounRatio": 1,
        "futurePastRatio": 0,
        "adjectiveWordRatio": 0.5,
        "namedEntityWordRatio": 1.125,
        "nounDistribution": [
          0.5,
          0.5
        ],
        "verbDistribution": [
          0.5,
          0.5
        ],
        "adjectiveDistribution": [
          0.75,
          0.25
        ]
      }
    }
  }
}
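
In the output above each distribution has one entry per sentence and sums to 1, which suggests it is each sentence's share of that part of speech. That reading is an inference from this one example, not documented behaviour. A sketch that checks the sums and finds the sentence with the largest share; `top_sentence` is a hypothetical helper.

```python
# Distributions copied from the response above.
analytics = {
    "nounDistribution": [0.5, 0.5],
    "verbDistribution": [0.5, 0.5],
    "adjectiveDistribution": [0.75, 0.25],
}

def top_sentence(distribution):
    """Return the index of the sentence with the largest share (first on ties)."""
    return max(range(len(distribution)), key=distribution.__getitem__)

for name, dist in analytics.items():
    print(name, "sums to", sum(dist), "peak sentence", top_sentence(dist))
```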


## Reflect Expressions

Reflect Expressions is a query that will return various stats about the text such as:

- word counts
- average word length
- sentence counts
- average sentence lengths
- meta tags used such as knowledge, experience or regulation
- phrase tags used such as outcome, temporal, pertains, consider, anticipate, etc.
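
Most tag counts are zero for short texts, so it can be useful to keep only the tags that fired. A sketch over the part of the `summary` object visible in the example output below; `nonzero` is a hypothetical helper.

```python
# Part of the summary object from the example output in this section.
summary = {
    "metaTagSummary": {
        "knowledge": 0, "experience": 0, "regulation": 0, "none": 2,
    },
    "phraseTagSummary": {
        "outcome": 0, "temporal": 0, "pertains": 0,
        "consider": 1, "anticipate": 0,
    },
}

def nonzero(tag_counts):
    """Keep only the tags with a count above zero."""
    return {tag: n for tag, n in tag_counts.items() if n > 0}

print(nonzero(summary["metaTagSummary"]))
print(nonzero(summary["phraseTagSummary"]))
```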

### Example:

In [61]:
# Set our query type to reflectExpressions
query = tap.query('reflectExpressions')

# no params needed

# pass in some test data
string = "This is the first great happy blue angry cold sentence I know. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string)

# Print Result
print("-" * 40)
print("Reflect Expressions:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Reflect Expressions:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence I know. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "reflectExpressions": {
      "querytime": 11,
      "analytics": {
        "counts": {
          "wordCount": 18,
          "avgWordLength": 4.555555555555555,
          "sentenceCount": 2,
          "avgSentenceLength": 9
        },
        "summary": {
          "metaTagSummary": {
            "knowledge": 0,
            "experience": 0,
            "regulation": 0,
            "none": 2
          },
          "phraseTagSummary": {
            "outcome": 0,
            "temporal": 0,
            "pertains": 0,
            "consider": 1,
            "anticipate": 0,
            "definite": 0,
            "possible": 0,
            "selfReflexive": 0,
            "emotive": 0,
            "selfPossessive": 0,
            "compare": 0,
 

## Affect Expressions

Affect Expressions is a query that will return stats about the valence, arousal and dominance language used.

You are able to pass in the thresholds at which each of them will trigger.
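
Once you have a response, you can also tighten the thresholds locally without querying again. The sketch below assumes each threshold is a lower bound on the corresponding score, which matches the example output in this section but is not confirmed here; `passes` is a hypothetical helper, and the scores are copied from that output.

```python
# Affect entries copied from the example output in this section.
affect = [
    {"text": "first", "valence": 7.33, "arousal": 4.9, "dominance": 6.38},
    {"text": "great", "valence": 7.5, "arousal": 4.14, "dominance": 6.65},
    {"text": "happy", "valence": 8.47, "arousal": 6.05, "dominance": 7.21},
]

def passes(entry, thresholds):
    """True if every score meets its threshold (assumed to be a lower bound)."""
    return all(entry[name] >= limit for name, limit in thresholds.items())

# Stricter arousal threshold than the 4 used in the query.
stricter = {"valence": 4, "arousal": 4.5, "dominance": 4}
kept = [e["text"] for e in affect if passes(e, stricter)]
print(kept)
```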

### Example:

In [68]:
# Set our query type to affectExpressions
query = tap.query('affectExpressions')

# Set our thresholds in a parameter
params = '''
{
    "valence":4,
    "arousal":4,
    "dominance":4
}
'''

# pass in some test data
string = "This is the first great happy blue angry cold sentence I know. This is the second fantastic sentence."

# query the api
strResult = tap.analyse_text(query, string, params)

# Print Result
print("-" * 40)
print("Affect Expressions:")
print("-" * 40)
print("Input Text:\n\n", string)
print("\n")
print("Raw Result:\n\n")
print(json.dumps(strResult, indent=2))

----------------------------------------
Affect Expressions:
----------------------------------------
Input Text:

 This is the first great happy blue angry cold sentence I know. This is the second fantastic sentence.


Raw Result:


{
  "data": {
    "affectExpressions": {
      "querytime": 84,
      "message": "",
      "timestamp": "2019-01-03T03:01:27.529517Z",
      "analytics": [
        {
          "affect": [
            {
              "text": "first",
              "valence": 7.33,
              "arousal": 4.9,
              "dominance": 6.38,
              "startIdx": 3
            },
            {
              "text": "great",
              "valence": 7.5,
              "arousal": 4.14,
              "dominance": 6.65,
              "startIdx": 4
            },
            {
              "text": "happy",
              "valence": 8.47,
              "arousal": 6.05,
              "dominance": 7.21,
              "startIdx": 5
            }
          ]
        },
        {