## Loading JSON
First, let's load the simplified JSON file. In this example it is Eminem's.

In [43]:
import json

with open('dataset/Eminem.json', 'r') as data_file:
    data = json.load(data_file)
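
The descriptions and titles in this file contain non-ASCII characters such as curly quotes, and `open()` without an `encoding` argument falls back to a platform-dependent default. Below is a minimal sketch of a more explicit loader, assuming the file is stored as UTF-8 (the notebook does not state this). The name `load_artist` and the temporary sample file are hypothetical and exist only for illustration.

```python
import json
import os
import tempfile

def load_artist(path):
    """Load one artist's JSON file, decoding it explicitly as UTF-8 (an assumption)."""
    with open(path, 'r', encoding='utf-8') as data_file:
        return json.load(data_file)

# Round-trip a tiny stand-in document so the sketch runs without the real dataset.
sample = {'songs': [{'title': '’Till I Collapse'}]}
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.json')
    with open(sample_path, 'w', encoding='utf-8') as out_file:
        json.dump(sample, out_file, ensure_ascii=False)
    loaded = load_artist(sample_path)

print(loaded['songs'][0]['title'])
```

With the real dataset, the call would presumably be `data = load_artist('dataset/Eminem.json')`.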

Now let's take a look at how the JSON is structured.

In [44]:
data

{'api_path': '/artists/45',
 'description': {'plain': 'A legendary hip-hop icon who started as an underground battle rapper in Detroit, Marshall “Eminem” Bruce Mathers III (1972 – present) has developed a career full of controversy, wild swings, and some of the most noteworthy raps in the history of the genre.\n\nEminem has broken countless barriers, shifting and impacting the culture in several ways. In June 2017, “Stan” was added into the Oxford Dictionary, and in 2019, to the Merriam-Webster dictionary. He was the first rapper to win the Grammy Award for Best Album for three consecutive albums. “Rap God” set the Guinness World Record for most words in a song. He was also the first rapper to win an Oscar. His albums The Marshall Mathers LP and The Eminem Show became certified Diamond by the RIAA in 2011, making him one of the few artists to have more than one Diamond album. This has helped him become the highest selling hip-hop artist of all time. In January 2020, with his 11th studi

That's a lot of information. Let's try extracting one selected field, such as the artist's description.

In [45]:
data['description']['plain']

'A legendary hip-hop icon who started as an underground battle rapper in Detroit, Marshall “Eminem” Bruce Mathers III (1972 – present) has developed a career full of controversy, wild swings, and some of the most noteworthy raps in the history of the genre.\n\nEminem has broken countless barriers, shifting and impacting the culture in several ways. In June 2017, “Stan” was added into the Oxford Dictionary, and in 2019, to the Merriam-Webster dictionary. He was the first rapper to win the Grammy Award for Best Album for three consecutive albums. “Rap God” set the Guinness World Record for most words in a song. He was also the first rapper to win an Oscar. His albums The Marshall Mathers LP and The Eminem Show became certified Diamond by the RIAA in 2011, making him one of the few artists to have more than one Diamond album. This has helped him become the highest selling hip-hop artist of all time. In January 2020, with his 11th studio album Music to Be Murdered By debuting at #1 on the 

It's still a bit unreadable. Some simple text formatting can help us.

In [46]:
desc = data['description']['plain']
for s in desc.split(sep='\n'):
    print(s)

A legendary hip-hop icon who started as an underground battle rapper in Detroit, Marshall “Eminem” Bruce Mathers III (1972 – present) has developed a career full of controversy, wild swings, and some of the most noteworthy raps in the history of the genre.

Eminem has broken countless barriers, shifting and impacting the culture in several ways. In June 2017, “Stan” was added into the Oxford Dictionary, and in 2019, to the Merriam-Webster dictionary. He was the first rapper to win the Grammy Award for Best Album for three consecutive albums. “Rap God” set the Guinness World Record for most words in a song. He was also the first rapper to win an Oscar. His albums The Marshall Mathers LP and The Eminem Show became certified Diamond by the RIAA in 2011, making him one of the few artists to have more than one Diamond album. This has helped him become the highest selling hip-hop artist of all time. In January 2020, with his 11th studio album Music to Be Murdered By debuting at #1 on the Bil

That's better. Now let's try listing all of Eminem's song titles in this JSON

In [47]:
songs = data['songs']
for song in songs:
    print(song['title'])

Rap God
Killshot
Godzilla
Lose Yourself
The Monster
Lucky You
The Ringer
River
Venom
Stan
Berzerk
Without Me
Not Alike
Fall
The Real Slim Shady
’Till I Collapse
Kamikaze
Walk on Water
Love the Way You Lie
Bad Guy
8 Mile: B-Rabbit vs Papa Doc
Mockingbird
Not Afraid
Headlights
No Love
Survival
Beautiful
Greatest
When I’m Gone
Cleanin’ Out My Closet
My Name Is
Love Game
Superman
The Way I Am
Legacy
Unaccommodating
Space Bound
Like Toy Soldiers
Guts Over Fear
Sing for the Moment
Gnat
Marshall Mathers
Stronger Than I Was
Believe
Kill You
Evil Twin
Detroit vs. Everybody
Untouchable
Darkness
Criminal
Kim
So Much Better
Beautiful Pain
I’m Back
You Gon’ Learn
Rhyme or Reason
Campaign Speech
Chloraseptic (Remix)
White America
Stepping Stone
Just Don’t Give a Fuck
Infinite
Kings Never Die
Asshole
Wicked Ways
Good Guy
So Far...
Shake That
Phenomenal
Normal
Role Model
Premonition (Intro)
Zeus
Guilty Conscience
Offended
Brainless
FACK
25 to Life
Leaving Heaven
Those Kinda Nights
Hailie’s Song
Ass Li

Now let's pick the first song, 'Rap God' in this example, and try some simple processing, starting with checking its raw type, length and content.

In [48]:
lyrics = songs[0]['lyrics']
print('Type: ', type(lyrics), ', Length: ', len(lyrics))

Type:  <class 'str'> , Length:  8051


In [49]:
lyrics

'[Intro]\n"Look, I was gonna go easy on you not to hurt your feelings"\n"But I\'m only going to get this one chance" (Six minutes— Six minutes—)\n"Something\'s wrong, I can feel it" (Six minutes, Slim Shady, you\'re on!)\n"Just a feeling I\'ve got, like something\'s about to happen, but I don\'t know what.\xa0\nIf that means what I think it means, we\'re in trouble, big trouble;\xa0\nAnd if he is as bananas as you say, I\'m not taking any chances"\n"You are just what the doc ordered"\n\n[Chorus]\nI\'m beginnin\' to feel like a Rap God, Rap God\nAll my people from the front to the back nod, back nod\nNow, who thinks their arms are long enough to slap box, slap box?\nThey said I rap like a robot, so call me Rap-bot\n\n[Verse 1]\nBut for me to rap like a computer it must be in my genes\nI got a laptop in my back pocket\nMy pen\'ll go off when I half-cock it\nGot a fat knot from that rap profit\nMade a livin\' and a killin\' off it\nEver since Bill Clinton was still in office\nWith Monica 

Once again it's a bit unreadable, but we can sort that out using the previous approach.

In [50]:
for s in lyrics.split(sep='\n'):
    print(s)

[Intro]
"Look, I was gonna go easy on you not to hurt your feelings"
"But I'm only going to get this one chance" (Six minutes— Six minutes—)
"Something's wrong, I can feel it" (Six minutes, Slim Shady, you're on!)
"Just a feeling I've got, like something's about to happen, but I don't know what. 
If that means what I think it means, we're in trouble, big trouble; 
And if he is as bananas as you say, I'm not taking any chances"
"You are just what the doc ordered"

[Chorus]
I'm beginnin' to feel like a Rap God, Rap God
All my people from the front to the back nod, back nod
Now, who thinks their arms are long enough to slap box, slap box?
They said I rap like a robot, so call me Rap-bot

[Verse 1]
But for me to rap like a computer it must be in my genes
I got a laptop in my back pocket
My pen'll go off when I half-cock it
Got a fat knot from that rap profit
Made a livin' and a killin' off it
Ever since Bill Clinton was still in office
With Monica Lewinsky feelin' on his nutsack
I'm an MC 

## JSON processing and NLTK 
Now let's start working with NLTK and tokenize the text, that is, produce a list of words and punctuation marks from it. We'll use NLTK's functions for this, but also some basic string functions for comparison's sake, and then check the type and length once again. But first, let's import the libraries we need for working with strings.

In [51]:
import nltk, re, pprint

tokens = nltk.word_tokenize(lyrics)
print('Type: ', type(tokens), ', Length: ', len(tokens))

Type:  <class 'list'> , Length:  1871


In [52]:
tokens

['[',
 'Intro',
 ']',
 "''",
 'Look',
 ',',
 'I',
 'was',
 'gon',
 'na',
 'go',
 'easy',
 'on',
 'you',
 'not',
 'to',
 'hurt',
 'your',
 'feelings',
 "''",
 "''",
 'But',
 'I',
 "'m",
 'only',
 'going',
 'to',
 'get',
 'this',
 'one',
 'chance',
 "''",
 '(',
 'Six',
 'minutes—',
 'Six',
 'minutes—',
 ')',
 "''",
 'Something',
 "'s",
 'wrong',
 ',',
 'I',
 'can',
 'feel',
 'it',
 "''",
 '(',
 'Six',
 'minutes',
 ',',
 'Slim',
 'Shady',
 ',',
 'you',
 "'re",
 'on',
 '!',
 ')',
 '``',
 'Just',
 'a',
 'feeling',
 'I',
 "'ve",
 'got',
 ',',
 'like',
 'something',
 "'s",
 'about',
 'to',
 'happen',
 ',',
 'but',
 'I',
 'do',
 "n't",
 'know',
 'what',
 '.',
 'If',
 'that',
 'means',
 'what',
 'I',
 'think',
 'it',
 'means',
 ',',
 'we',
 "'re",
 'in',
 'trouble',
 ',',
 'big',
 'trouble',
 ';',
 'And',
 'if',
 'he',
 'is',
 'as',
 'bananas',
 'as',
 'you',
 'say',
 ',',
 'I',
 "'m",
 'not',
 'taking',
 'any',
 'chances',
 "''",
 "''",
 'You',
 'are',
 'just',
 'what',
 'the',
 'doc',
 'order

In [53]:
re.split(r'\W+', lyrics)

['',
 'Intro',
 'Look',
 'I',
 'was',
 'gonna',
 'go',
 'easy',
 'on',
 'you',
 'not',
 'to',
 'hurt',
 'your',
 'feelings',
 'But',
 'I',
 'm',
 'only',
 'going',
 'to',
 'get',
 'this',
 'one',
 'chance',
 'Six',
 'minutes',
 'Six',
 'minutes',
 'Something',
 's',
 'wrong',
 'I',
 'can',
 'feel',
 'it',
 'Six',
 'minutes',
 'Slim',
 'Shady',
 'you',
 're',
 'on',
 'Just',
 'a',
 'feeling',
 'I',
 've',
 'got',
 'like',
 'something',
 's',
 'about',
 'to',
 'happen',
 'but',
 'I',
 'don',
 't',
 'know',
 'what',
 'If',
 'that',
 'means',
 'what',
 'I',
 'think',
 'it',
 'means',
 'we',
 're',
 'in',
 'trouble',
 'big',
 'trouble',
 'And',
 'if',
 'he',
 'is',
 'as',
 'bananas',
 'as',
 'you',
 'say',
 'I',
 'm',
 'not',
 'taking',
 'any',
 'chances',
 'You',
 'are',
 'just',
 'what',
 'the',
 'doc',
 'ordered',
 'Chorus',
 'I',
 'm',
 'beginnin',
 'to',
 'feel',
 'like',
 'a',
 'Rap',
 'God',
 'Rap',
 'God',
 'All',
 'my',
 'people',
 'from',
 'the',
 'front',
 'to',
 'the',
 'back',
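
The two outputs above differ mainly in how contractions are handled: `re.split(r'\W+', ...)` cuts "I'm" into `'I'` and `'m'` and leaves an empty string at the start. As a hedged alternative, the sketch below matches words directly instead of splitting on separators. The pattern is an assumption made for illustration, not something the notebook uses, and it only recognises the straight apostrophe.

```python
import re

sample = '"Look, I was gonna go easy on you"\nI\'m beginnin\' to feel'

# Splitting on runs of non-word characters cuts contractions apart
# and leaves an empty string when the text starts with punctuation.
split_tokens = re.split(r'\W+', sample)

# Matching words directly, optionally followed by an apostrophe and more
# word characters, keeps "I'm" whole and never yields empty strings.
word_pattern = re.compile(r"\w+(?:'\w+)*")
found_tokens = word_pattern.findall(sample)

print(split_tokens)
print(found_tokens)
```

A trailing apostrophe, as in "beginnin'", is still dropped by this pattern, which matches what the split-based approach does.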


As you can see, headers such as [Intro] and so on clutter the structure of the lyrics and make it hard to retrieve meaningful data, so let's get rid of them.

In [54]:
cleared_lyrics = re.sub(r"\[.*?\]", "", lyrics)
cleared_lyrics

'\n"Look, I was gonna go easy on you not to hurt your feelings"\n"But I\'m only going to get this one chance" (Six minutes— Six minutes—)\n"Something\'s wrong, I can feel it" (Six minutes, Slim Shady, you\'re on!)\n"Just a feeling I\'ve got, like something\'s about to happen, but I don\'t know what.\xa0\nIf that means what I think it means, we\'re in trouble, big trouble;\xa0\nAnd if he is as bananas as you say, I\'m not taking any chances"\n"You are just what the doc ordered"\n\n\nI\'m beginnin\' to feel like a Rap God, Rap God\nAll my people from the front to the back nod, back nod\nNow, who thinks their arms are long enough to slap box, slap box?\nThey said I rap like a robot, so call me Rap-bot\n\n\nBut for me to rap like a computer it must be in my genes\nI got a laptop in my back pocket\nMy pen\'ll go off when I half-cock it\nGot a fat knot from that rap profit\nMade a livin\' and a killin\' off it\nEver since Bill Clinton was still in office\nWith Monica Lewinsky feelin\' on his
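
The `?` in `\[.*?\]` makes the match non-greedy, which matters whenever two bracketed tags share a line. The sketch below contrasts the two variants on a made-up line; the `[Produced by X]` tag is a hypothetical example and does not come from the dataset.

```python
import re

line = '[Verse 1] text [Produced by X] more'

# Non-greedy: each bracketed tag is matched and removed on its own.
non_greedy = re.sub(r'\[.*?\]', '', line)

# Greedy: one match runs from the first "[" to the last "]" on the line,
# so the words between the two tags are lost as well.
greedy = re.sub(r'\[.*\]', '', line)

print(repr(non_greedy))
print(repr(greedy))
```

Because `.` does not match a newline by default, headers on separate lines would be handled correctly by either form; the non-greedy one is simply the safer choice.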

Now we have tidy text to work on, but it would be even better to group the fragments of the text and store them in a dictionary on which we can perform further experiments.

In [55]:
headers = {}
for m in re.finditer(r'\[.*?\]', lyrics):
    headers[m.start()] = m.group()
headers

{0: '[Intro]',
 468: '[Chorus]',
 694: '[Verse 1]',
 1786: '[Chorus]',
 2108: '[Verse 2]',
 3590: '[Chorus]',
 3883: '[Verse 3]'}

In [56]:
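fragments = {}
key_list = sorted(headers.keys())
duplicates = 0  # running counter that keeps repeated section names unique

for index, start in enumerate(key_list):
    header = headers[start]
    new_key = re.sub(r'\[|\]', '', header)

    # A fragment runs from the end of its header to the start of the next
    # header, or to the end of the lyrics for the last header.
    text_start = start + len(header)
    if index == len(key_list) - 1:
        text = lyrics[text_start:]
    else:
        text = lyrics[text_start:key_list[index + 1]]

    # A repeated header (e.g. a second "Chorus") gets a numeric suffix.
    if new_key in fragments:
        duplicates += 1
        fragments[new_key + str(duplicates)] = text
    else:
        fragments[new_key] = text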
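
The cell above uses a single counter for all repeated headers, which works for this song because only the chorus repeats. If several section names repeated, a per-name counter would number them independently. The sketch below shows one way to do that. The function name `split_sections` and the `"Chorus (2)"` key format are assumptions for illustration and differ from the `Chorus1` keys produced above.

```python
import re
from collections import Counter

def split_sections(lyrics):
    """Return {section name: text}, numbering each repeated name separately."""
    matches = list(re.finditer(r'\[(.*?)\]', lyrics))
    seen = Counter()
    sections = {}
    for current, following in zip(matches, matches[1:] + [None]):
        name = current.group(1)
        end = following.start() if following is not None else len(lyrics)
        # The first occurrence keeps the plain name; repeats get their own number.
        key = name if seen[name] == 0 else f'{name} ({seen[name] + 1})'
        seen[name] += 1
        sections[key] = lyrics[current.end():end].strip()
    return sections

sample = '[Intro]\nhello\n[Chorus]\nla la\n[Verse 1]\nwords\n[Chorus]\nla la la'
sections = split_sections(sample)
for name, text in sections.items():
    print(name, '->', repr(text))
```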

Now we can see the structure of our new dictionary.

In [57]:
for k, v in fragments.items():
    print(k, v)

Intro 
"Look, I was gonna go easy on you not to hurt your feelings"
"But I'm only going to get this one chance" (Six minutes— Six minutes—)
"Something's wrong, I can feel it" (Six minutes, Slim Shady, you're on!)
"Just a feeling I've got, like something's about to happen, but I don't know what. 
If that means what I think it means, we're in trouble, big trouble; 
And if he is as bananas as you say, I'm not taking any chances"
"You are just what the doc ordered"


Chorus 
I'm beginnin' to feel like a Rap God, Rap God
All my people from the front to the back nod, back nod
Now, who thinks their arms are long enough to slap box, slap box?
They said I rap like a robot, so call me Rap-bot


Verse 1 
But for me to rap like a computer it must be in my genes
I got a laptop in my back pocket
My pen'll go off when I half-cock it
Got a fat knot from that rap profit
Made a livin' and a killin' off it
Ever since Bill Clinton was still in office
With Monica Lewinsky feelin' on his nutsack
I'm an MC s

OK, now let's check how many words each fragment has. To do that, we tokenize every fragment and print the length of the resulting list.

In [58]:
for k, v in fragments.items():
    tokens = re.split(r'\W+', v)
    print(k, ' - words: ' + str(len(tokens)))

Intro  - words: 97
Chorus  - words: 50
Verse 1  - words: 222
Chorus1  - words: 66
Verse 2  - words: 304
Chorus2  - words: 58
Verse 3  - words: 867
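
One caveat about these numbers: `re.split(r'\W+', ...)` returns an empty string whenever the text starts or ends with a non-word character, and the fragments begin with a newline, so each count is likely one or two higher than the number of actual words. A small sketch of the difference, using a made-up fragment rather than the real dictionary:

```python
import re

fragment = "\nI'm beginnin' to feel like a Rap God, Rap God\n\n"

split_count = len(re.split(r'\W+', fragment))   # also counts the empty strings
word_count = len(re.findall(r'\w+', fragment))  # counts only actual word runs

print(split_count, word_count)
```

Filtering with `[t for t in tokens if t]` after splitting would give the same result as `re.findall(r'\w+', ...)`.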


With that in place, we'll later use the same functions to process all of the rappers in the dataset and create some interesting visualizations of the experiments. For now, let's get back to the entire text and first check its tokenization.

In [59]:
tokens = nltk.word_tokenize(cleared_lyrics)
len(tokens)

1847

In [60]:
tokens = re.split(r'\W+', cleared_lyrics)
len(tokens)

1652

Since the official word count for this song is closer to our regex-based tokenization, we'll go with that. Now it's time for some text normalization, starting with stemming, a process that strips affixes from words.

In [61]:
porter = nltk.PorterStemmer()
p_stem = [porter.stem(t) for t in tokens]
p_stem

['',
 'look',
 'I',
 'wa',
 'gonna',
 'go',
 'easi',
 'on',
 'you',
 'not',
 'to',
 'hurt',
 'your',
 'feel',
 'but',
 'I',
 'm',
 'onli',
 'go',
 'to',
 'get',
 'thi',
 'one',
 'chanc',
 'six',
 'minut',
 'six',
 'minut',
 'someth',
 's',
 'wrong',
 'I',
 'can',
 'feel',
 'it',
 'six',
 'minut',
 'slim',
 'shadi',
 'you',
 're',
 'on',
 'just',
 'a',
 'feel',
 'I',
 've',
 'got',
 'like',
 'someth',
 's',
 'about',
 'to',
 'happen',
 'but',
 'I',
 'don',
 't',
 'know',
 'what',
 'If',
 'that',
 'mean',
 'what',
 'I',
 'think',
 'it',
 'mean',
 'we',
 're',
 'in',
 'troubl',
 'big',
 'troubl',
 'and',
 'if',
 'he',
 'is',
 'as',
 'banana',
 'as',
 'you',
 'say',
 'I',
 'm',
 'not',
 'take',
 'ani',
 'chanc',
 'you',
 'are',
 'just',
 'what',
 'the',
 'doc',
 'order',
 'I',
 'm',
 'beginnin',
 'to',
 'feel',
 'like',
 'a',
 'rap',
 'god',
 'rap',
 'god',
 'all',
 'my',
 'peopl',
 'from',
 'the',
 'front',
 'to',
 'the',
 'back',
 'nod',
 'back',
 'nod',
 'now',
 'who',
 'think',
 'their

In [62]:
lancaster = nltk.LancasterStemmer()
l_stem = [lancaster.stem(t) for t in tokens]
l_stem

['',
 'look',
 'i',
 'was',
 'gonn',
 'go',
 'easy',
 'on',
 'you',
 'not',
 'to',
 'hurt',
 'yo',
 'feel',
 'but',
 'i',
 'm',
 'on',
 'going',
 'to',
 'get',
 'thi',
 'on',
 'chant',
 'six',
 'minut',
 'six',
 'minut',
 'someth',
 's',
 'wrong',
 'i',
 'can',
 'feel',
 'it',
 'six',
 'minut',
 'slim',
 'shady',
 'you',
 're',
 'on',
 'just',
 'a',
 'feel',
 'i',
 've',
 'got',
 'lik',
 'someth',
 's',
 'about',
 'to',
 'hap',
 'but',
 'i',
 'don',
 't',
 'know',
 'what',
 'if',
 'that',
 'mean',
 'what',
 'i',
 'think',
 'it',
 'mean',
 'we',
 're',
 'in',
 'troubl',
 'big',
 'troubl',
 'and',
 'if',
 'he',
 'is',
 'as',
 'banana',
 'as',
 'you',
 'say',
 'i',
 'm',
 'not',
 'tak',
 'any',
 'chant',
 'you',
 'ar',
 'just',
 'what',
 'the',
 'doc',
 'ord',
 'i',
 'm',
 'beginnin',
 'to',
 'feel',
 'lik',
 'a',
 'rap',
 'god',
 'rap',
 'god',
 'al',
 'my',
 'peopl',
 'from',
 'the',
 'front',
 'to',
 'the',
 'back',
 'nod',
 'back',
 'nod',
 'now',
 'who',
 'think',
 'their',
 'arm',
 

A lemmatizer, on the other hand, removes affixes only if the resulting word is in its dictionary.

In [63]:
wnl = nltk.WordNetLemmatizer()
lemmatizer = [wnl.lemmatize(t) for t in tokens]
lemmatizer

['',
 'Look',
 'I',
 'wa',
 'gonna',
 'go',
 'easy',
 'on',
 'you',
 'not',
 'to',
 'hurt',
 'your',
 'feeling',
 'But',
 'I',
 'm',
 'only',
 'going',
 'to',
 'get',
 'this',
 'one',
 'chance',
 'Six',
 'minute',
 'Six',
 'minute',
 'Something',
 's',
 'wrong',
 'I',
 'can',
 'feel',
 'it',
 'Six',
 'minute',
 'Slim',
 'Shady',
 'you',
 're',
 'on',
 'Just',
 'a',
 'feeling',
 'I',
 've',
 'got',
 'like',
 'something',
 's',
 'about',
 'to',
 'happen',
 'but',
 'I',
 'don',
 't',
 'know',
 'what',
 'If',
 'that',
 'mean',
 'what',
 'I',
 'think',
 'it',
 'mean',
 'we',
 're',
 'in',
 'trouble',
 'big',
 'trouble',
 'And',
 'if',
 'he',
 'is',
 'a',
 'banana',
 'a',
 'you',
 'say',
 'I',
 'm',
 'not',
 'taking',
 'any',
 'chance',
 'You',
 'are',
 'just',
 'what',
 'the',
 'doc',
 'ordered',
 'I',
 'm',
 'beginnin',
 'to',
 'feel',
 'like',
 'a',
 'Rap',
 'God',
 'Rap',
 'God',
 'All',
 'my',
 'people',
 'from',
 'the',
 'front',
 'to',
 'the',
 'back',
 'nod',
 'back',
 'nod',
 'Now',

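It looks like the lemmatizer did best on this text, although it still turns 'was' into 'wa' and 'as' into 'a' because every token is treated as a noun by default. Now let's check the 50 most frequent words in the text.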

In [64]:
text = nltk.Text(tokens)
print('Type: ', type(text), ' length: ', len(text))

Type:  <class 'nltk.text.Text'>  length:  1652


In [65]:
from nltk.probability import FreqDist

fdist = FreqDist(text)
vocab = list(fdist)
vocab[:50]

['I',
 'a',
 'the',
 'to',
 'you',
 'm',
 'and',
 's',
 'it',
 'in',
 'of',
 'that',
 'be',
 'like',
 'me',
 'But',
 'my',
 'boy',
 'get',
 't',
 'as',
 'from',
 'lookin',
 'what',
 'You',
 'back',
 'with',
 'on',
 'And',
 'll',
 'they',
 're',
 'but',
 'know',
 'for',
 'when',
 'was',
 'say',
 'are',
 'Rap',
 'God',
 'rap',
 'still',
 'at',
 'make',
 'not',
 'this',
 'can',
 'don',
 'if']
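
The list above contains both 'But' and 'but', and both 'You' and 'you', because the tokens keep their original case. A hedged sketch of case-folding before counting, using `collections.Counter` and a small made-up token list instead of the full lyrics:

```python
from collections import Counter

tokens = ['But', 'I', 'rap', 'like', 'a', 'Rap', 'God', 'but', 'you', 'You', '']

# Drop empty strings and fold case so 'But'/'but' and 'You'/'you' share one count.
normalized = [t.lower() for t in tokens if t]
counts = Counter(normalized)

print(counts.most_common(3))
```

Whether folding case is desirable depends on the experiment: it also merges 'Rap' with 'rap', which may or may not be wanted.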

As we can see, one of the most frequently used words is "you". We can check where it appears in the text.

In [66]:
text.concordance('you')

Displaying 25 of 43 matches:
 Look I was gonna go easy on you not to hurt your feelings But I m onl
 I can feel it Six minutes Slim Shady you re on Just a feeling I ve got like so
ig trouble And if he is as bananas as you say I m not taking any chances You ar
as you say I m not taking any chances You are just what the doc ordered I m beg
This flippity dippity hippity hip hop You don t really wanna get into a pissin 
ough to slap box slap box Let me show you maintainin this shit ain t that hard 
W A Cube hey Doc Ren Yella Eazy thank you they got Slim Inspired enough to one 
 alcohol of fame On the wall of shame You fags think it s all a game til I walk
lank and tell me what in the fuck are you thinkin Little gay lookin boy So gay 
with a straight face lookin boy Ha ha You re witnessin a mass occur Like you re
ha You re witnessin a mass occur Like you re watching a church gathering take p
 s gay that s all they say lookin boy You get a thumbs up pat on the back And a
ry day lookin boy He

Now it's time for collocations: sequences of words that occur together unusually often.

In [67]:
text.collocations()

lookin boy; Rap God; back nod; slap box; Six minutes; hip hop; long
enough; God Rap; nod back; feel like; box slap; rap like; face lookin;
say lookin
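
To get an intuition for where pairs like "Rap God" and "back nod" come from, adjacent tokens can simply be paired up and counted. This is only a sketch of the underlying idea: NLTK ranks candidates with its own scoring, which the raw counts below do not reproduce.

```python
from collections import Counter

tokens = 'Rap God Rap God back nod back nod'.split()

# Pair every token with its successor and count how often each pair occurs.
bigram_counts = Counter(zip(tokens, tokens[1:]))

print(bigram_counts.most_common(2))
```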


Next, let's check how many distinct words are used in this song.

In [68]:
vocab = sorted(set(tokens))
distinct_words = len(vocab)
distinct_words

671
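
Put next to the total of 1652 tokens, 671 distinct tokens give a ratio of roughly 0.41. Both figures still include the empty string and count 'But' and 'but' separately. The helper below is a hypothetical sketch of that ratio (often called the type-token ratio) on a tiny made-up list; it is not part of the original notebook.

```python
def type_token_ratio(tokens):
    """Share of distinct tokens among all non-empty tokens (0.0 for no tokens)."""
    words = [t for t in tokens if t]
    if not words:
        return 0.0
    return len(set(words)) / len(words)

sample_tokens = ['Rap', 'God', 'Rap', 'God', 'back', 'nod', 'back', 'nod', '']
print(type_token_ratio(sample_tokens))
```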

OK, time for some POS tagging: the process of classifying words into their parts of speech and labeling them accordingly.

In [69]:
pos_tagger = nltk.pos_tag([i for i in tokens if i])
pos_tagger

[('Look', 'NN'),
 ('I', 'PRP'),
 ('was', 'VBD'),
 ('gonna', 'VBN'),
 ('go', 'VBP'),
 ('easy', 'JJ'),
 ('on', 'IN'),
 ('you', 'PRP'),
 ('not', 'RB'),
 ('to', 'TO'),
 ('hurt', 'VB'),
 ('your', 'PRP$'),
 ('feelings', 'NNS'),
 ('But', 'CC'),
 ('I', 'PRP'),
 ('m', 'VBP'),
 ('only', 'RB'),
 ('going', 'VBG'),
 ('to', 'TO'),
 ('get', 'VB'),
 ('this', 'DT'),
 ('one', 'CD'),
 ('chance', 'NN'),
 ('Six', 'NNP'),
 ('minutes', 'NNS'),
 ('Six', 'NNP'),
 ('minutes', 'NNS'),
 ('Something', 'VBG'),
 ('s', 'JJ'),
 ('wrong', 'JJ'),
 ('I', 'PRP'),
 ('can', 'MD'),
 ('feel', 'VB'),
 ('it', 'PRP'),
 ('Six', 'NNP'),
 ('minutes', 'NNS'),
 ('Slim', 'NNP'),
 ('Shady', 'NNP'),
 ('you', 'PRP'),
 ('re', 'VBP'),
 ('on', 'IN'),
 ('Just', 'NNP'),
 ('a', 'DT'),
 ('feeling', 'NN'),
 ('I', 'PRP'),
 ('ve', 'VBP'),
 ('got', 'NNS'),
 ('like', 'IN'),
 ('something', 'NN'),
 ('s', 'NNS'),
 ('about', 'IN'),
 ('to', 'TO'),
 ('happen', 'VB'),
 ('but', 'CC'),
 ('I', 'PRP'),
 ('don', 'VBP'),
 ('t', 'RB'),
 ('know', 'VBP'),
 ('what',

In the documentation we can find that, for example, CC stands for coordinating conjunction, RB for adverb, IN for preposition, NN for noun and JJ for adjective. The meaning of any tag can be looked up using:

In [70]:
nltk.download('tagsets')
nltk.help.upenn_tagset('RB')

RB: adverb
    occasionally unabatingly maddeningly adventurously professedly
    stirringly prominently technologically magisterially predominately
    swiftly fiscally pitilessly ...


[nltk_data] Downloading package tagsets to
[nltk_data]     C:\Users\korni\AppData\Roaming\nltk_data...
[nltk_data]   Package tagsets is already up-to-date!


We can also check which (word, tag) pair occurs most often.

In [71]:
nltk.FreqDist(pos_tagger).max()

('I', 'PRP')
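
`FreqDist(pos_tagger).max()` counts whole (word, tag) pairs, so the result is the most frequent pair rather than the most frequent tag. To count tags alone, the word can be dropped before counting. A minimal sketch using `collections.Counter` on a few pairs copied from the tagger output above:

```python
from collections import Counter

# A few (word, tag) pairs copied from the tagger output above.
tagged = [('Look', 'NN'), ('I', 'PRP'), ('was', 'VBD'), ('gonna', 'VBN'),
          ('go', 'VBP'), ('easy', 'JJ'), ('on', 'IN'), ('you', 'PRP'),
          ('not', 'RB'), ('to', 'TO'), ('hurt', 'VB'), ('your', 'PRP$'),
          ('feelings', 'NNS'), ('But', 'CC'), ('I', 'PRP')]

# Count tags alone, ignoring which word carried them.
tag_counts = Counter(tag for _, tag in tagged)

print(tag_counts.most_common(1))
```

On the full song, the same idea would presumably read `nltk.FreqDist(tag for _, tag in pos_tagger).max()`.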