# Get Character quotes

For sentiment analysis, we propose using each character's quotes. To do that, we first need to compile all the quotes for each character.

Conveniently, each character on the wiki has a 'Quotes' page, which can be reached at `Category:{Name}/Quotes` (for example, https://marvel.fandom.com/wiki/Category:Peter_Parker_(Earth-616)/Quotes). Easy, right? Not quite: even though that page exists, querying its content through the API will most likely return only `{{Quotes}}`. The quotes therefore have to be collected from the individual pages that belong to the category.
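To make the difference between the two kinds of request concrete, here is a minimal sketch that only builds the URLs and sends nothing. The helper name `build_query` is hypothetical (it is not part of this notebook), and the parameter sets simply mirror the ones used in the cells that follow.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

API_URL = "https://marvel.fandom.com/api.php"


def build_query(params):
    """Hypothetical helper: return an API URL with URL-encoded parameters."""
    return API_URL + "?" + urlencode(params)


category = "Category:Peter_Parker_(Earth-616)/Quotes"

# Request the category page's own content (tends to be just the template)
page_url = build_query({
    "action": "query",
    "titles": category,
    "prop": "revisions",
    "rvprop": "content",
    "rvslots": "*",
    "format": "json",
})

# Request the list of pages that belong to the category instead
members_url = build_query({
    "action": "query",
    "list": "categorymembers",
    "cmtitle": category,
    "cmlimit": "max",
    "format": "json",
})

print(page_url)
print(members_url)
```

Building the query string with `urlencode` avoids hand-escaping the parentheses, colon and slash in the category title.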

In [1]:
import json
import re
import urllib.parse
import urllib.request

import pandas as pd
import numpy as np

from tqdm.notebook import tqdm

# Register progress_apply on pandas objects
tqdm.pandas()

In [2]:
def searchQuote(title):
  """Fetch the latest revision of a wiki page and return the parsed JSON."""
  baseurl = "https://marvel.fandom.com/api.php?"
  action = "action=query"
  # Replace spaces with underscores, then URL-encode the page title
  title_param = "titles={}".format(urllib.parse.quote_plus(title.replace(" ", "_")))

  content = "prop=revisions&rvprop=content&rvslots=*"
  dataformat = "format=json"

  query = "{}{}&{}&{}&{}".format(baseurl, action, content, title_param, dataformat)

  wikiresponse = urllib.request.urlopen(query)
  wikidata = wikiresponse.read()
  wikitext = wikidata.decode('utf-8')

  return json.loads(wikitext)

def displayWiki(wiki):
    """Return the wikitext of the first page in a query response."""
    pages = wiki["query"]["pages"]
    code = str(list(pages.keys())[0])
    content = pages[code]["revisions"][0]["slots"]["main"]["*"]
    return content

In [3]:
def get_quotes(character):
  """Collect every quotation from the pages in a character's Quotes category."""
  cmcontinue_text = ""
  first_time = True

  quotes_list = []

  # Keep requesting batches until the API stops returning a continuation token
  while cmcontinue_text or first_time:

    first_time = False

    baseurl = "https://marvel.fandom.com/api.php?"
    action = "action=query&list=categorymembers"
    q_title = "cmtitle=Category:{}/Quotes".format(urllib.parse.quote_plus(character.replace(" ", "_")))

    content = "prop=revisions&rvprop=content&rvslots=*"
    dataformat = "format=json"

    # Send cmcontinue only when a previous response supplied a token
    cmcontinue = "cmlimit=max"
    if cmcontinue_text:
      cmcontinue += "&cmcontinue={}".format(urllib.parse.quote_plus(cmcontinue_text))

    query = "{}{}&{}&{}&{}&{}".format(baseurl, action, q_title, content, dataformat, cmcontinue)
    wikiresponse = urllib.request.urlopen(query)
    wikidata = wikiresponse.read()
    wikitext = wikidata.decode('utf-8')

    wiki_json = json.loads(wikitext)

    pages_with_quote = [page["title"] for page in wiki_json["query"]["categorymembers"]]

    for page in pages_with_quote:
      page_text = displayWiki(searchQuote(page))

      # Capture the text after "Quotation ... = " up to the end of the line
      quotes_list += re.findall(r"Quotation.*?= (.*?)\n", page_text)

    # A "continue" block means more category members remain
    if "continue" in wiki_json:
      cmcontinue_text = wiki_json["continue"]["cmcontinue"]
    else:
      cmcontinue_text = ""

  return quotes_list
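
The extraction step in `get_quotes` relies on a single regular expression, so it may help to see it run on its own. The snippet below is a small offline sketch: the `sample_wikitext` string is made up for illustration and only assumes that each quote page has a `Quotation ... = ` field on one line, which is what the pattern expects. The real template layout on the wiki may differ.

```python
import re

# Same pattern as in get_quotes: text after "Quotation ... = " up to the newline
QUOTE_PATTERN = r"Quotation.*?= (.*?)\n"

# Hypothetical wikitext, invented for this example (not copied from the wiki)
sample_wikitext = (
    "{{Quote\n"
    "| Quotation         = First '''sample''' quote.\n"
    "| Speaker           = [[Some Character]]\n"
    "}}\n"
    "{{Quote\n"
    "| Quotation = Second sample quote.\n"
    "}}\n"
)

quotes = re.findall(QUOTE_PATTERN, sample_wikitext)
print(quotes)
```

Because `.` does not match a newline by default, a quotation that spans several lines would be cut at the first line break, and any wiki markup such as `'''bold'''` or `[[links]]` is kept as-is in the captured text.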

In [4]:
get_quotes("Venom_(Symbiote)_(Earth-616)")

["Been to many worlds, but '''none''' of them [[Earth|this strange]]. Understood '''feelings''' before, but simple feelings - like colors, bold and bright. Happy. Sad. Angry. Then... met [[Peter Parker (Earth-616)|Spider-Man]]. Feelings got '''complicated'''. Learned guilt. Also the first time I felt '''fear'''. Felt '''agony'''. Learned feeling: '''Betrayal'''. Learned '''first words''' they called me. Monster. Parasite. '''Bad'''.[...] Feels '''good''' to be a hero. Did bad things, too. Can't deny. '''[[MacDonald Gargan (Earth-616)|Mac Gargan]]''' was bad. Thoughts like poison stingers. It was a thrill to kill. '''Knew''' it was bad. Didn't care. Gargan made it '''easy'''. Got to punish Gargan for what he did. He was evil and afraid. [[Lee Price (Earth-616)|Lee Price]] was not afraid. He was a '''soldier'''. Hurt and desperate. I '''trusted''' him. '''Talked''' to him. But Lee was too strong. Didn't want to talk. Didn't want to be a hero. Wanted power. Couldn't stop the bad things he

https://marvel.fandom.com/api.php?action=query&list=categorymembers&cmtitle=Category:Peter_Parker_(Earth-616)/Quotes&cmlimit=500&prop=revisions&rvprop=content&rvslots=*&format=json
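
A category query like the one above returns a limited batch of members per response. When more remain, the response carries a `continue` block whose `cmcontinue` token has to be sent back with the next request. The following is a minimal offline sketch of that loop. The names `iter_category_members`, `fake_fetch` and `FAKE_BATCHES` are hypothetical, and the responses are hand-written stand-ins shaped like the JSON that `get_quotes` reads, not real API output.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_category_members(fetch: Callable[[Optional[str]], Dict]) -> Iterator[str]:
    """Yield member titles, following cmcontinue tokens until none is returned."""
    token = None
    while True:
        response = fetch(token)
        for member in response["query"]["categorymembers"]:
            yield member["title"]
        # No "continue" block means this was the last batch
        if "continue" not in response:
            break
        token = response["continue"]["cmcontinue"]


# Hypothetical two-batch answer, keyed by the token that requests each batch
FAKE_BATCHES = {
    None: {
        "continue": {"cmcontinue": "token-2"},
        "query": {"categorymembers": [{"title": "Page A"}, {"title": "Page B"}]},
    },
    "token-2": {
        "query": {"categorymembers": [{"title": "Page C"}]},
    },
}


def fake_fetch(token):
    """Stand-in for a network call: return the canned batch for this token."""
    return FAKE_BATCHES[token]


titles = list(iter_category_members(fake_fetch))
print(titles)
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP call, so it can be checked without touching the network.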