# Spoken Language Processing 2022-23

# Lab3 - Dialogue Systems

_Bruno Martins_


This lab assignment will introduce tools and concepts related to the development of dialogue systems, exemplifying also the use of automatic speech recognition and text-to-speech models.

Students will be tasked with the development of a simple (spoken/conversational) question answering system, reusing different models associated with the HuggingFace Transformers library:

* Speech recognition models (e.g., OpenAI Whisper).
* Large language models for natural language understanding and generation (e.g., GPT-2 or Alpaca models).
* Text-to-speech models (e.g., SpeechT5).

The first parts of this notebook will guide students in the use of the tools, while the last part presents the main problem that is to be tackled. Note that the first parts also feature intermediate tasks, which students are required to solve.

To complete the project, student groups must deliver in Fenix an updated version of this notebook, featuring the proposed solutions to each task, together with a small PDF report (2 pages) outlining the methods that were developed (you can use the [following Overleaf template](https://www.overleaf.com/latex/templates/interspeech-2023-paper-kit/kzcdqdmkqvbr) for the report).

Students are encouraged to modify examples, incorporate any other techniques, and in general explore any approach that may permit improving the results. Assessment will be based on task completion, creativity in the proposed solutions, and overall accuracy over a benchmark dataset.

### Group identification

Initialize the variable `group_id` with the number that Fenix assigned to your group and `student1_name`, `student1_id`, `student2_name` and `student2_id` with your names and student numbers.

In [26]:
# # YOUR CODE HERE
# raise NotImplementedError()
group_id = 9
student1_name = "Afonso Araújo"
student1_id = 96138
student2_name = "Santiago Quintas"
student2_id = 93179
print(f"Group number: {group_id}")
print(f"Student 1: {student1_name} ({student1_id})")
print(f"Student 2: {student2_name} ({student2_id})")

Group number: 9
Student 1: Afonso Araújo (96138)
Student 2: Santiago Quintas (93179)


In [27]:
assert isinstance(group_id, int) and isinstance(student1_id, int) and isinstance(student2_id, int)
assert isinstance(student1_name, str) and isinstance(student2_name, str) 
assert (group_id > 0) and (group_id < 40)
assert (student1_id > 60000) and (student1_id < 120000) and (student2_id > 60000) and (student2_id < 120000)

# Python packages

NumPy is a Python library that provides functions to process multidimensional array objects. The NumPy documentation is available [here](https://numpy.org/doc/1.24/).

Librosa is a Python package for analyzing and processing audio signals. It provides a wide range of tools for tasks such as loading and manipulating audio files, extracting features from audio signals, and visualizing and playing back audio data.

IPython display is a module in the IPython interactive computing environment that provides a set of functions for displaying various types of media in the Jupyter notebook or other IPython-compatible environments. For example, you can use the display() function to display an object in a notebook cell (for example an audio object).

Matplotlib is a popular Python library that allows users to create a wide range of visualizations using a simple and intuitive syntax.

HuggingFace Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models based on the Transformer architecture. The documentation is available [here](https://huggingface.co/docs/transformers/index). The associated *datasets* and *evaluate* libraries respectively support direct access to many well-known datasets and to common evaluation metrics used in NLP and speech research. For more details, look at the official [HuggingFace course](https://huggingface.co/course/chapter1/1).

In [28]:
# !pip3 install sentencepiece
# !pip3 install xformers
# !pip3 install transformers
# !pip3 install datasets
# !pip3 install evaluate
# !pip3 install jiwer
# !pip3 install librosa

In [29]:
import transformers
import datasets
import evaluate
import numpy as np
import librosa
import librosa.display
from IPython.display import Audio, display
from matplotlib import pyplot as plt

# Using OpenAI Whisper

Whisper is an exciting new model for Automatic Speech Recognition (ASR), developed by OpenAI and made available through the HuggingFace Transformers library. The following example illustrates the use of the Whisper model to transcribe a small audio sample taken from the LibriSpeech dataset.

In [42]:
import torch
from transformers import AutoProcessor, WhisperForConditionalGeneration
from datasets import load_dataset

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
# 3 instances - 1 context paragraph - question - answer
processor_whisper = AutoProcessor.from_pretrained("openai/whisper-tiny.en")
model_whisper = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny.en")
# Pass the sampling rate stored with the audio so the processor can check it
sample = ds[0]["audio"]
inputs = processor_whisper(sample["array"], sampling_rate=sample["sampling_rate"], return_tensors="pt")
input_features = inputs.input_features
generated_ids = model_whisper.generate(inputs=input_features)
transcription = processor_whisper.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(transcription)



 Mr. Quilter is the apostle of the middle classes, and we are glad to welcome his gospel.


Automatic Speech Recognition (ASR) models are frequently evaluated through the Word Error Rate (WER). 

The WER is derived from the Levenshtein distance, working at the word level and aligning the recognized word sequence with the reference (spoken) word sequence using dynamic string alignment. The metric can then be computed as:

WER = (S + D + I) / N = (S + D + I) / (S + D + C),

where S is the number of substitutions, D is the number of deletions, I is the number of insertions, C is the number of correct words, and N is the number of words in the reference (N=S+D+C). The WER value indicates the average number of errors per reference word. The lower the value, the better the performance of the ASR system, with a WER of 0 being a perfect score.

In [31]:
from evaluate import load

wer = load("wer")
predictions = ["this is the prediction", "there is an other sample"]
references = ["this is the reference", "there is another one"]
wer_score = wer.compute(predictions=predictions, references=references)

print(wer_score)

0.5
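The 0.5 score above can be reproduced by hand. The following is a minimal sketch, assuming plain whitespace tokenization and no text normalization; the helper names `word_edit_distance` and `corpus_wer` are illustrative and not part of the `evaluate` library. It computes the word-level Levenshtein distance with dynamic programming and divides the total number of edits by the total number of reference words.

```python
def word_edit_distance(ref_words, hyp_words):
    """Minimum number of substitutions, deletions and insertions (S + D + I)."""
    rows, cols = len(ref_words) + 1, len(hyp_words) + 1
    # dist[i][j]: edits needed to align the first i reference words
    # with the first j hypothesis words
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i reference words
    for j in range(cols):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, rows):
        for j in range(1, cols):
            substitution_cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
            )
    return dist[-1][-1]


def corpus_wer(predictions, references):
    """Total edits over all pairs divided by the total number of reference words."""
    errors = 0
    total_ref_words = 0
    for hypothesis, reference in zip(predictions, references):
        ref_words = reference.split()
        errors += word_edit_distance(ref_words, hypothesis.split())
        total_ref_words += len(ref_words)
    return errors / total_ref_words


predictions = ["this is the prediction", "there is an other sample"]
references = ["this is the reference", "there is another one"]
manual_wer = corpus_wer(predictions, references)
print(manual_wer)
```

The first pair contributes one substitution and the second pair two substitutions plus one insertion, so there are 4 errors over 8 reference words.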


## Intermediate tasks:

* Collect two small audio samples with your own voice, together with a transcription of the spoken messages. The following [example shows how to record audio from your microphone within a Python notebook](https://colab.research.google.com/gist/ricardodeazambuja/03ac98c31e87caf284f7b06286ebf7fd/microphone-to-numpy-array-from-your-browser-in-colab.ipynb#scrollTo=H4rxNhsEpr-c), but you can use any other method to collect the audio samples.
* Use the Whisper speech recognition model to transcribe the two spoken messages that were collected.
* Use the transcriptions to compute the word error rate.
* Experiment with the use of different recognition models (e.g., larger Whisper models), and see if the error rate changes.
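When comparing Whisper output against a hand-written transcript, differences in casing and punctuation count as word errors unless both sides are normalized first. The sketch below shows one possible normalization, assuming English text; the function name `normalize_transcript` and the example strings are illustrative, and the choice of what to strip is a design decision rather than a fixed rule.

```python
import re
import string


def normalize_transcript(text):
    """Lowercase, remove ASCII punctuation and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


# Illustrative strings: a Whisper-style hypothesis and a plain reference
hypothesis = " Hello, how are you?"
reference = "hello how are you"

normalized_hypothesis = normalize_transcript(hypothesis)
print(normalized_hypothesis == normalize_transcript(reference))
```

Note that removing punctuation also removes apostrophes (for example, "don't" becomes "dont"), so the same function should be applied to both predictions and references before computing the WER.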

In [44]:
# Add your solutions to the exercises
# in this and other cells here.
# Load the two recorded samples and resample them to the 16 kHz rate Whisper expects
audio1, sr1 = librosa.load("helloHowRu.wav")
audio1 = librosa.resample(audio1, orig_sr=sr1, target_sr=16000)
audio2, sr2 = librosa.load("whereRuFrom.wav")
audio2 = librosa.resample(audio2, orig_sr=sr2, target_sr=16000)

def transcribe(audio, processor, model, sampling_rate=16000):
    """Transcribe a mono waveform with the given Whisper processor and model."""
    features = processor(audio, sampling_rate=sampling_rate, return_tensors="pt").input_features
    generated = model.generate(inputs=features)
    return processor.batch_decode(generated, skip_special_tokens=True)[0]

# Use the Whisper speech recognition model to transcribe the two recordings
transcription1 = transcribe(audio1, processor_whisper, model_whisper)
transcription2 = transcribe(audio2, processor_whisper, model_whisper)
print(transcription1)
print(transcription2)

# Reference transcripts of the two recordings (assumed here from the file names)
references = ["Hello, how are you?", "Where are you from?"]
wer_score = wer.compute(predictions=[transcription1, transcription2], references=references)
print(wer_score)

display(Audio(audio1, rate=16000))
display(Audio(audio2, rate=16000))

# Experiment with a larger Whisper model and see if the WER changes.
# Separate variable names keep the tiny model available for the later cells.
processor_small = AutoProcessor.from_pretrained("openai/whisper-small.en")
model_small = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small.en")
transcription1_small = transcribe(audio1, processor_small, model_small)
transcription2_small = transcribe(audio2, processor_small, model_small)
wer_score_small = wer.compute(predictions=[transcription1_small, transcription2_small], references=references)
print(wer_score_small)



 Hello, how are you?
 Where are you from?



# Using LLMs for conditional language generation

OpenAI GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. Thus, GPT-2 can be used to address problems like question answering, modeling the task as language generation conditioned on the question (plus other relevant additional context).

The following example illustrates the use of GPT-2 through the HuggingFace Transformers library (the `gpt2` checkpoint loaded in the example is the smallest released variant, with far fewer parameters than the 1.5-billion-parameter model). In this case, instead of using the model directly, we use it through the pipeline API, which makes it easier to swap in other LLMs. The `pipeline()` function connects a model with its necessary preprocessing and postprocessing steps, allowing us to directly input any text and get an intelligible answer.

In [45]:
from transformers import pipeline, set_seed

set_seed(42) # make results deterministic

generator = pipeline('text-generation', model='gpt2')
generator("Who is the president of the United States? The answer is", max_length=15, num_return_sequences=1)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Who is the president of the United States? The answer is: Donald Trump'}]
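As the output shows, `generated_text` repeats the prompt before the continuation. To use the continuation as an answer (for scoring or for text-to-speech), the prompt has to be stripped off. The sketch below is one hedged way to do this; the helper names `build_qa_prompt` and `extract_answer` and the prompt layout are assumptions made for illustration, not part of the Transformers API.

```python
def build_qa_prompt(question, context=None):
    """Assemble a plain-text prompt; the layout is an illustrative choice."""
    parts = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Question: {question}")
    parts.append("Answer:")
    return "\n".join(parts)


def extract_answer(generated_text, prompt):
    """Drop the echoed prompt and keep the first line of the continuation."""
    if generated_text.startswith(prompt):
        answer = generated_text[len(prompt):]
    else:
        answer = generated_text
    # Remove surrounding whitespace and a leading colon such as ": Donald Trump"
    answer = answer.strip().lstrip(": ").strip()
    lines = answer.splitlines()
    return lines[0].strip() if lines else ""


prompt = "Who is the president of the United States? The answer is"
generated_text = "Who is the president of the United States? The answer is: Donald Trump"
print(extract_answer(generated_text, prompt))
```

With a pipeline such as `generator` above, `generated_text` would be taken from the first element of the returned list.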

## Intermediate tasks:

* Adapt the example showing how to use GPT-2 to do question answering over the [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) (available from HuggingFace datasets).
* Evaluate the results obtained with different models (e.g., Alpaca-based models) and/or different usage strategies (e.g., consider prompting, parameter efficient fine-tuning, etc.).
* Compute the error over the testing split from the SQuAD dataset, using the [official metric](https://huggingface.co/spaces/evaluate-metric/squad).
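The official SQuAD metric reports exact match (EM) and token-level F1, both computed after normalizing answers (lowercasing, removing punctuation and the articles a/an/the) and taking the best score over the available reference answers. The following is a self-contained sketch of that computation, written from the published evaluation procedure; it is meant for understanding the scores and is not a replacement for the `squad` metric in `evaluate`. The toy predictions and references are made up for illustration.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction, reference):
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def squad_scores(predictions, references):
    """references[i] is the list of acceptable answers for predictions[i]."""
    em_total = 0.0
    f1_total = 0.0
    for prediction, golds in zip(predictions, references):
        em_total += max(exact_match(prediction, gold) for gold in golds)
        f1_total += max(f1_score(prediction, gold) for gold in golds)
    count = len(predictions)
    return {"exact_match": 100.0 * em_total / count, "f1": 100.0 * f1_total / count}


# Toy data, not model output
toy_predictions = ["the Eiffel Tower", "Paris"]
toy_references = [["Eiffel Tower"], ["in Paris, France"]]
scores = squad_scores(toy_predictions, toy_references)
print(scores)
```

In the toy data the first prediction matches exactly once the article is removed, while the second shares one of three reference tokens, so it earns partial F1 credit but no exact match.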


In [53]:
# Access SQuAD dataset
ds = load_dataset("squad", split="validation")
bleu = load("bleu")

# Access the first `sample_size` samples
sample_size = 6
questions_vec = [ds[i]["question"] for i in range(sample_size)]
answers_vec = [ds[i]["answers"]["text"] for i in range(sample_size)]
contexts_vec = [ds[i]["context"] for i in range(sample_size)]

# Only the checkpoints with "squad" in their name are fine-tuned for extractive QA;
# the others get a newly initialised QA head, so their answers are not expected to be meaningful
model_names = ['gpt2', 'bert-large-uncased-whole-word-masking-finetuned-squad',
               'albert-base-v2', 'distilbert-base-uncased-distilled-squad',
               'xlnet-base-cased', 'roberta-base']

predictions = {model_name: [] for model_name in model_names}

# Evaluate the results obtained with different models,
# loading each pipeline once instead of once per question
for model_name in model_names:
    qa_pipeline = pipeline('question-answering', model=model_name)
    for i in range(sample_size):
        result = qa_pipeline(question=questions_vec[i], context=contexts_vec[i])
        predictions[model_name].append(result['answer'])

# Compute metrics for each model
for model_name in model_names:
    print(f"Metric for {model_name}...")
    print(bleu.compute(predictions=predictions[model_name], references=answers_vec))



Question: Which NFL team represented the AFC at Super Bowl 50?
Question: Which NFL team represented the NFC at Super Bowl 50?
Question: Where did Super Bowl 50 take place?
Question: Which NFL team won Super Bowl 50?
Question: What color was used to emphasize the 50th anniversary of the Super Bowl?
Question: What was the theme of Super Bowl 50?


# Using SpeechT5 for converting text-to-speech

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in different natural language processing tasks, the unified-modal SpeechT5 framework explores encoder-decoder pre-training for self-supervised speech/text representation learning. 

The model is again conveniently available through the HuggingFace Transformers library. The following example illustrates the use of the SpeechT5 model for generating a spectrogram from a textual input, together with a neural vocoder model for producing a speech signal.

In [47]:
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan, set_seed
from IPython.display import Audio
import soundfile as sf
import torch

set_seed(42) # make results deterministic

model_T5 = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")
processor_T5 = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")

inputs = processor_T5(text="I have brought peace, freedom, justice and security to my new empire! Your new empire? Don't make me hurt you! Anakin, my allegiance is to the republic to DEMOCRACY! If you are not with me, then you are my enemy. Only a sith deals in absolutes... I will do what i must! You will try...", return_tensors="pt")
speaker_embeddings = torch.zeros((1, 512))

speech = model_T5.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("tts_example.wav", speech.numpy(), samplerate=16000)
Audio("tts_example.wav", autoplay=True)
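The example above passes a fairly long, multi-sentence text to the model in a single call. Very long inputs may exceed what the model handles well, so one option is to synthesise the text piece by piece. The sketch below only covers the text-splitting step; the function name `split_into_chunks` and the `max_chars` value are arbitrary assumptions, and the sentence splitting is a simple regular expression rather than a proper sentence tokenizer.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group consecutive sentences into chunks of at most roughly max_chars characters."""
    # Split after sentence-final punctuation followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the chunk, so start a new one
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


example_chunks = split_into_chunks("First sentence. Second one! Third?", max_chars=20)
print(example_chunks)
```

Each chunk could then be passed through `processor_T5` and `model_T5.generate_speech` as in the cell above, with the resulting waveforms joined using `np.concatenate` before writing the file. A single sentence longer than `max_chars` is kept whole rather than cut mid-sentence.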

## Intermediate tasks:

* Connect the results from your answer to the previous intermediate task (i.e., conditioned language generation) to the SpeechT5 text-to-speech model, so as to produce speech outputs from the text generated by the model.
* Produce speech-based answers for the first 5 questions in the testing split from the SQuAD dataset.
* Connect also the results from your answer to the first intermediate task (i.e., automated speech recognition) to the SpeechT5 model and the LLM, so as to take spoken questions as input.
* Collect small audio samples, with your own voice, for the first 5 questions in the testing split from the SQuAD dataset, and produce speech-based answers for these five questions.
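The tasks above chain three components: speech recognition, question answering and text-to-speech. One way to keep the chain easy to test and to swap models in and out is to pass each stage in as a function. The sketch below is a hypothetical structure, not a prescribed design; `answer_spoken_question` and the three `fake_*` stand-ins are illustrative names, and the stand-ins exist only so the example runs without loading any model.

```python
def answer_spoken_question(audio, context, transcribe, answer, synthesize):
    """Run ASR, then QA over the context, then TTS, returning every intermediate result."""
    question = transcribe(audio).strip()
    answer_text = answer(question, context)
    speech = synthesize(answer_text)
    return {"question": question, "answer": answer_text, "speech": speech}


# Stand-in components (assumptions for illustration only)
def fake_transcribe(audio):
    return " Where did Super Bowl 50 take place?"


def fake_answer(question, context):
    # Pretend the answer is the first sentence of the context
    return context.split(".")[0]


def fake_synthesize(text):
    # Pretend one audio sample is produced per character
    return [0.0] * len(text)


result = answer_spoken_question(
    audio=[0.0] * 16000,
    context="Toy context sentence. Another sentence.",
    transcribe=fake_transcribe,
    answer=fake_answer,
    synthesize=fake_synthesize,
)
print(result["question"], "->", result["answer"])
```

In the notebook, the stand-ins would be replaced by wrappers around the Whisper model, a question-answering or text-generation pipeline, and the SpeechT5 model with its vocoder.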


In [66]:
# TTS FROM MODELS

model_names = ['gpt2', 'bert-large-uncased-whole-word-masking-finetuned-squad', 'albert-base-v2', 'distilbert-base-uncased-distilled-squad', 'xlnet-base-cased']

# Load each question-answering pipeline once and reuse it below
qa_pipelines = {model_name: pipeline('question-answering', model=model_name) for model_name in model_names}
speaker_embeddings = torch.zeros((1, 512))

def load_audio_16k(path):
    """Load an audio file and resample it to the 16 kHz rate expected by Whisper."""
    audio, sr = librosa.load(path)
    return librosa.resample(audio, orig_sr=sr, target_sr=16000)

def transcribe_question(audio):
    """Transcribe a 16 kHz waveform with the Whisper model loaded earlier."""
    inputs = processor_whisper(audio, sampling_rate=16000, return_tensors="pt")
    generated_ids = model_whisper.generate(inputs=inputs.input_features)
    return processor_whisper.batch_decode(generated_ids, skip_special_tokens=True)[0]

def speak(text):
    """Synthesise text with SpeechT5 and show an audio player for it."""
    inputs = processor_T5(text=text, return_tensors="pt")
    speech = model_T5.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
    sf.write("tts_example.wav", speech.numpy(), samplerate=16000)
    display(Audio("tts_example.wav", autoplay=False))

def answer_and_speak(question, context):
    """Answer a question with every model and read each answer aloud."""
    print("Question is - ", question)
    for model_name, qa_pipeline in qa_pipelines.items():
        print(f"Current model is {model_name}...")
        result = qa_pipeline(question=question, context=context)
        print("Answer from model is - ", result["answer"])
        speak(result["answer"])

# TTS for the first SQuAD question, taken directly from the dataset text
for i in range(0, 1):
    answer_and_speak(questions_vec[i], contexts_vec[i])

# TTS FROM AUDIO FILES
# These two recordings are not tied to any passage, so the transcription doubles as the context
for path in ["helloHowRu.wav", "whereRuFrom.wav"]:
    transcription = transcribe_question(load_audio_16k(path))
    answer_and_speak(transcription, transcription)

# TTS from recordings of the first 5 questions from SQuAD
# 1st Question - Which NFL team represented the AFC at Super Bowl 50?
# 2nd Question - Which NFL team represented the NFC at Super Bowl 50?
# 3rd Question - Where did Super Bowl 50 take place?
# 4th Question - Which NFL team won Super Bowl 50?
# 5th Question - What color was used to emphasize the 50th anniversary of the Super Bowl?
question_files = ["q1(1).wav", "q2.wav", "q3(1).wav", "q4.wav", "q5.wav"]
for i, path in enumerate(question_files):
    transcription = transcribe_question(load_audio_16k(path))
    # Pair each spoken question with the SQuAD context it was asked about
    answer_and_speak(transcription, contexts_vec[i])

    


Question is -   which NFL team represented the AFC at Super Bowl 15.
Question is -   which NFL team won Super Bowl 50.


# Main problem

Students are tasked with joining together the speech recognition, language understanding and generation, and text-to-speech models, in order to build a conversational spoken question answering approach.

* The method should take as input speech utterances with questions.
* The language understanding and generation component should use as input a transcription for the current speech utterance, and also transcriptions from previous speech utterances (i.e., the conversation context).
* The language understanding and generation component can explore different strategies for improving answer quality:
  * Prompting the language model with (retrieved) in-context examples.
  * Using parameter-efficient fine-tuning with existing conversational question answering datasets (e.g., [the CoQA dataset](https://stanfordnlp.github.io/coqa/), available from HuggingFace datasets).
  * ...
* The text-to-speech component takes as input the results from language generation, and produces a speech output.
* Both the automated speech recognition and the text-to-speech components can explore different approaches, although students should attempt to justify their choices (e.g., if changing the automated speech recognition component, show that it achieves a lower WER).
* Collect small audio samples, with your own voice, for the first instance in the CoQA testing split, and show the results produced by your method for this example.
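The second requirement above, using transcriptions of previous utterances as conversation context, can be met by prepending the most recent question/answer turns to the current question. The sketch below shows one possible prompt layout; the function name `build_conversational_prompt`, the `Story:`/`Q:`/`A:` labels, the `max_turns` cut-off and the example turns are all assumptions made for illustration.

```python
def build_conversational_prompt(story, history, question, max_turns=3):
    """Build a prompt from the story, the last max_turns (question, answer) pairs and the new question."""
    # Slicing with -0 would return the whole list, so handle zero explicitly
    recent_turns = history[-max_turns:] if max_turns > 0 else []
    lines = [f"Story: {story}"]
    for past_question, past_answer in recent_turns:
        lines.append(f"Q: {past_question}")
        lines.append(f"A: {past_answer}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


# Made-up conversation, used only to show the resulting layout
history = [
    ("Who had a cat?", "Anna"),
    ("What was its name?", "Milo"),
    ("What color was it?", "white"),
]
conversational_prompt = build_conversational_prompt(
    story="Anna had a white cat named Milo.",
    history=history,
    question="Where did it live?",
    max_turns=2,
)
print(conversational_prompt)
```

Keeping only the last few turns bounds the prompt length, which matters for models with a short context window. After each answer is generated, the new (question, answer) pair would be appended to `history`.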


In [105]:
# Add your solutions to the exercises
# ACCESS CoQA DATASET
# !wget https://nlp.stanford.edu/data/coqa/coqa-train-v1.0.json
# !wget https://nlp.stanford.edu/data/coqa/coqa-dev-v1.0.json
# !wget https://nlp.stanford.edu/data/coqa/coqa-eval-v1.0.json
# Collect small audio samples, with your own voice, for the first instance in the CoQA testing split, and show the results produced by your method for this example
ds = load_dataset("coqa", split="train")
# Keep only short instances: a single question whose answer span starts within the first 100 characters
ds = ds.filter(lambda x: len(x["answers"]["answer_start"]) == 1)
ds = ds.filter(lambda x: x["answers"]["answer_start"][0] < 100)
ds = ds.filter(lambda x: x["answers"]["answer_start"][0] > 0)
ds = ds.filter(lambda x: len(x["questions"]) == 1)
print(ds[0].keys())  # Top-level keys of the first example
print(ds[0]["answers"].keys())  # Keys of the "answers" dictionary
print(ds[4]["questions"])  # Questions of the fifth example
print(ds[0]["answers"]["input_text"])  # Free-form answer text
print(ds[0]["answers"]["answer_start"])  # Start offset of the supporting span
print(ds[0]["answers"]["answer_end"])  # End offset of the supporting span
starting_index = ds[0]["answers"]["answer_start"]
ending_index = ds[0]["answers"]["answer_end"]
story = ds[0]["story"]
# The start offset is shifted by one so that, for this example, the slice begins at the first character of the span
answer_span = story[starting_index[0]-1:ending_index[0]]
print(story)
print(answer_span)
# Top 5 CoQA questions
# 1st Question - What is the name of the movie?
# 2nd Question - Who wanted to build a bridge?
# 3rd Question - Why is a man from the Detroit area being tried?
# 4th Question - Who left the house?
# 5th Question - Who is Michael Scofield?
audio_files = ["CoQA_1.wav", "CoQA_2.wav", "CoQA_3.wav", "CoQA_4.wav", "CoQA_5.wav"]
for path in audio_files:
    # Load each recording and resample it to the 16 kHz rate Whisper expects
    audio, sr = librosa.load(path)
    audio = librosa.resample(audio, orig_sr=sr, target_sr=16000)
    inputs = processor_whisper(audio, sampling_rate=16000, return_tensors="pt")
    input_features = inputs.input_features
    generated_ids = model_whisper.generate(inputs=input_features)
    transcription = processor_whisper.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print("Question is - ", transcription)






dict_keys(['source', 'story', 'questions', 'answers'])
dict_keys(['input_text', 'answer_start', 'answer_end'])
['Who is Michael Scofield?']
['Identity Thief is the first mentioned']
[13]
[26]
(EW.com) -- Identity Thief (CinemaScore: B) fared even better than expected, bringing in $36.6 million over the weekend across 3,141 theaters. For comparison, Melissa McCarthy's last major film Bridesmaids (though it was in a supporting role) opened at $26.2 million, in 2,918 theaters. With an opening like this, big things are surely expected from Seth Gordon's R-rated comedy which has already surpassed its $35 million production budget. Though Bateman and Gordon had a successful run with Horrible Bosses after a $28.3 million opening weekend in July 2011, Bateman hasn't had this kind of luck with most of his starring roles. Universal's The Change-Up (with Ryan Reynolds) opened at $13.5 million in August 2011 and went on to gross only $37.1 million domestically, on a $52 million production budget. 

Question is -   What is the name of the movie?
Question is -   Who wanted to build a bridge?
Question is -   Why is a man from the Detroit area being tried?
Question is -   Who left the house?
Question is -   Who is Michael Scofield?
