### Text summarization using BART

In [7]:
from transformers import BartConfig, BartModel

# Initializing a BART facebook/bart-large style configuration
configuration = BartConfig()

# Initializing a model (with random weights) from the facebook/bart-large style configuration
model = BartModel(configuration)

# Accessing the model configuration
configuration = model.config

In [9]:
from transformers import AutoTokenizer, BartForConditionalGeneration
import Phase1.recipe_parser as rp
import os
import pickle

# Load cached recipe descriptions if available; otherwise parse them again
if os.path.exists('../pickle_files/recipe_descs.pkl'):
    with open('../pickle_files/recipe_descs.pkl', 'rb') as f:
        descs = pickle.load(f)
else:
    descs = rp.get_recipe_descs()

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

print(descs[1])


# Sample article text (unused below; descs[2] is summarized instead)
ARTICLE_TO_SUMMARIZE = (
    "PG&E stated it scheduled the blackouts in response to forecasts for high winds "
    "amid dry conditions. The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were "
    "scheduled to be affected by the shutoffs which were expected to last through at least midday tomorrow."
)
# Truncate explicitly so inputs longer than 1024 tokens are cut to max_length
inputs = tokenizer(descs[2], max_length=1024, truncation=True, return_tensors="pt")

# Generate Summary
summary_ids = model.generate(inputs["input_ids"], num_beams=2, min_length=0, max_length=20)
tokenizer.batch_decode(summary_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]


Spread it on sandwiches, toss it with pasta, or treat yourself a single happy spoonful, but definitely absolutely positively make pesto any chance you get.


'Corn tortillas are made with just two ingredients: masa harina and water.'
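
The tokenize, generate, and decode steps above can be wrapped in one function so the same settings apply to any entry in `descs`. This is a minimal sketch, not part of the original notebook: the name `summarize` and its default values are assumptions, and it expects the `model` and `tokenizer` objects loaded earlier to be passed in.

```python
def summarize(text, model, tokenizer, max_input_tokens=1024, max_summary_tokens=20, num_beams=2):
    """Return one summary string for `text` (hypothetical helper)."""
    # Tokenize with explicit truncation so long inputs are cut to the limit
    inputs = tokenizer(
        text,
        max_length=max_input_tokens,
        truncation=True,
        return_tensors="pt",
    )
    # Beam search over the encoded input; the attention mask marks real tokens
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        num_beams=num_beams,
        min_length=0,
        max_length=max_summary_tokens,
    )
    # Decode the first (and only) generated sequence back to text
    return tokenizer.batch_decode(
        summary_ids,
        skip_special_tokens=True,
        clean_up_tokenization_spaces=False,
    )[0]
```

With the objects from the cell above, `summarize(descs[2], model, tokenizer)` would run the same pipeline on the third description.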

## GPT-2 + TTS

In [11]:
from transformers import pipeline, set_seed
from TTS.api import TTS

count = 0

### GPT-2

In [18]:
generator = pipeline('text-generation', model='gpt2-large')
set_seed(42)
recipe = "Pesto Pasta"
text = "The historical curiosity about " + recipe + " is"
Generated_Text = generator(text, max_length=100, num_return_sequences=1)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
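
Because `max_length=100` limits generation by token count, the generated text can stop partway through a sentence. One possible way to handle this before speech synthesis is to drop the trailing fragment. The sketch below is an illustration only; `trim_to_last_sentence` is a hypothetical helper that is not in the original notebook.

```python
import re


def trim_to_last_sentence(text: str) -> str:
    """Cut `text` after its last sentence-ending mark (., ! or ?).

    If no such mark is found, the stripped text is returned unchanged.
    """
    # Match a terminator that is followed by whitespace or the end of the string
    matches = list(re.finditer(r"[.!?](?=\s|$)", text))
    if not matches:
        return text.strip()
    return text[: matches[-1].end()].strip()
```

It could be applied as `trim_to_last_sentence(Generated_Text[0]['generated_text'])` before the text is handed to the TTS model.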


### TTS

In [19]:
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=False)


tts.tts_to_file(text=Generated_Text[0]['generated_text'],
                file_path="../audio_files/output_"+recipe+".wav",
                speaker_wav="../audio_files/Sample en.wav",
                language="en")

count += 1


 > tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
 > Using model: xtts
 > Text splitted to sentences.
['The historical curiosity about Pesto Pasta is more than matched by the desire to know more about it, to figure out how and why we like it.', "Today's blog contains links to two books that tell its story and describe what made it famous:", 'The History of Pasta Pasta', 'And', 'The History of Baking Pasta', 'The History of Pesto Pasta was written by an engineer named Victor Dibbell.', 'A professor emeritus at Brown University']
 > Processing time: 70.91373085975647
 > Real-time factor: 2.2924159729094296
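
The output path above is built by concatenating the recipe name directly, so a name such as "Pesto Pasta" puts a space in the file name. A minimal sketch of a safer alternative follows; `audio_path_for` and its default directory are assumptions made for illustration, not part of the original notebook.

```python
import re
from pathlib import Path


def audio_path_for(recipe: str, out_dir: str = "../audio_files") -> Path:
    """Build a .wav path for `recipe` using a lowercase, underscore-separated slug."""
    # Collapse every run of non-alphanumeric characters into a single underscore
    slug = re.sub(r"[^A-Za-z0-9]+", "_", recipe).strip("_").lower()
    return Path(out_dir) / f"output_{slug}.wav"
```

The result can be passed as `file_path=str(audio_path_for(recipe))` in the `tts.tts_to_file` call.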
