In [None]:
# default_exp data.summarization

In [None]:
#hide
%reload_ext autoreload
%autoreload 2
%matplotlib inline

# data.summarization

> This module contains the bits required to use the fastai DataBlock API and/or mid-level data processing pipelines to organize your data for summarization tasks using architectures like BART and T5.

In [None]:
#export
import ast
from functools import reduce

import torch
from transformers import *
from fastai.text.all import *

from blurr.utils import *
from blurr.data.core import *

In [None]:
#hide
import pdb

from nbdev.showdoc import *
from fastcore.test import *

In [None]:
#cuda
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}: {torch.cuda.get_device_name()}')

Using GPU #1: GeForce GTX 1080 Ti


## Summarization tokenization, batch transform, and DataBlock methods

Summarization tasks attempt to generate a human-understandable and sensible representation of a larger body of text (e.g., capture the meaning of a larger document in 1-3 sentences).

In [None]:
path = Path('./')
cnndm_df = pd.read_csv(path/'cnndm_sample.csv'); len(cnndm_df)

1000

In [None]:
cnndm_df.head(2)

Unnamed: 0,article,highlights,ds_type
0,"(CNN) -- Globalization washes like a flood over the world's cultures and economies. Floods can be destructive; however, they can also bring blessings, as the annual floods of the Nile did for ancient Egypt. The world's great universities can be crucial instruments in shaping, in a positive way, humankind's reaction to globalization and the development of humankind itself. Traditionally, universities have been defined and limited by location, creating an academic community and drawing students and scholars to that place. Eventually, some universities began to encourage students to study el...","John Sexton: Traditionally, universities have been defined and limited by location .\nGlobal campuses form a network of thought, innovation, he writes .\nFaculty can teach, Sexton says, students can team up in many cities at once .\nSexton: Research, scholarship can be shared and cultural ties made in ""century of knowledge""",train
1,"(CNN) -- Armenian President Robert Kocharian declared a state of emergency Saturday night after a day of clashes between police and protesters, a spokeswoman for the Armenian Foreign Ministry said. Opposition supporters wave an Armenian flag during a protest rally in Yerevan, Armenia, on Saturday. The protesters claim last month's presidential election was rigged. The state of emergency will ""hopefully bring some order"" to the capital, Yerevan, said Salpi Ghazarian, assistant to the Armenian foreign minister, who spoke to CNN early Sunday. The state of emergency could last until March 20, ...","NEW: Protest moves after crackdown at Freedom Square .\nOrder sought after protests over last month's election turn violent .\nDemonstrators say the election was fraudulent .\nState of emergency could last until March 20, official says .",train


In [None]:
pretrained_model_name = "facebook/bart-large-cnn"

hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(pretrained_model_name, 
                                                                               model_cls=BartForConditionalGeneration)

hf_arch, type(hf_tokenizer), type(hf_config), type(hf_model)

('bart',
 transformers.tokenization_bart.BartTokenizer,
 transformers.configuration_bart.BartConfig,
 transformers.modeling_bart.BartForConditionalGeneration)

In [None]:
#export
class HF_SummarizationInput(list): pass

We create a subclass of `HF_BatchTransform` for summarization tasks to add `decoder_input_ids` and `labels` to our inputs during training, which in turn allows the huggingface model to calculate the loss for us. See [here](https://huggingface.co/transformers/model_doc/bart.html#transformers.BartModel.forward) for more information on how these additional inputs are used in summarization and conversational training tasks.

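Note also that `labels` is simply the target `input_ids` offset by one position relative to `decoder_input_ids` (`decoder_input_ids` drops the last token, `labels` drops the first), since the task is to predict the next token based on the current (and all previous) `decoder_input_ids`.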

And lastly, we also update our targets to just be the `input_ids` of our target sequence so that fastai's `Learner.show_results` works (again, almost all the fastai bits require returning a single tensor to work).
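
To make the one-position offset between `decoder_input_ids` and `labels` concrete, here is a minimal sketch using plain Python lists rather than tensors. The token ids, the `PAD_TOKEN_ID` value of 1, and the helper name `build_decoder_inputs` are made up for illustration; `-100` is the default `ignore_index` of PyTorch's cross-entropy loss, which is why padded label positions are set to it.

```python
# Hypothetical values for illustration only: a padded target sequence and pad id.
PAD_TOKEN_ID = 1
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def build_decoder_inputs(target_ids, pad_token_id=PAD_TOKEN_ID):
    """Return (decoder_input_ids, labels) for one target sequence."""
    # The decoder sees every token except the last one ...
    decoder_input_ids = target_ids[:-1]
    # ... and at each position must predict the token that follows it.
    labels = target_ids[1:]
    # Padding positions should not contribute to the loss.
    labels = [IGNORE_INDEX if t == pad_token_id else t for t in labels]
    return decoder_input_ids, labels

target_ids = [0, 713, 16, 10, 2, 1, 1]   # <s> tok tok tok </s> <pad> <pad>
decoder_input_ids, labels = build_decoder_inputs(target_ids)

print(decoder_input_ids)  # [0, 713, 16, 10, 2, 1]
print(labels)             # [713, 16, 10, 2, -100, -100]
```

Both outputs have the same length, and `labels[i]` is always the token that follows `decoder_input_ids[i]` in the original target sequence.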

In [None]:
#export
class HF_SummarizationBatchTransform(HF_BatchTransform):
    def __init__(self, hf_arch, hf_tokenizer, **kwargs):
        super().__init__(hf_arch, hf_tokenizer, HF_SummarizationInput, **kwargs)
        
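    def encodes(self, samples):  
        samples = super().encodes(samples)
        # samples with inputs only (no targets) are returned unchanged
        if (len(samples[0]) == 1): return samples
        
        updated_samples = []
        for s in samples:
            # decoder inputs: the target ids without the last token
            s[0]['decoder_input_ids'] = s[1]['input_ids'][:-1].clone()
            # labels: the target ids without the first token (the next token at each position)
            s[0]['labels'] = s[1]['input_ids'][1:].clone()
            # -100 marks padding positions so they are ignored when the loss is calculated
            s[0]['labels'][s[0]['labels'] == self.hf_tokenizer.pad_token_id] = -100
            
            # the target becomes a single tensor of ids so fastai methods like show_results work
            targ_ids = s[1]['input_ids']
            
            updated_samples.append((s[0], targ_ids))
        
        return updated_samples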
    
    def decodes(self, encoded_samples):
        if (isinstance(encoded_samples, dict)): return self.hf_input_return_type([encoded_samples['input_ids']])
        return [encoded_samples]

We had to override the `decodes` method above because, while our inputs and targets start out as the same kind of object, we update the latter to consist of *only* the target `input_ids` so that methods like `Learner.show_results` work. Nevertheless, because fastai remembers what they are, `HF_TokenizerTransform.decodes` will be called for both, and it works on a `list` of input_ids.
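
The two branches of `decodes` can be sketched without fastai or transformers. In this illustrative stand-in, `SummarizationInput` and `decode_sample` are hypothetical names, and plain lists take the place of tensors: a dict (the inputs) is reduced to its `input_ids` and wrapped in the input type, while anything else (the targets, already bare ids) is simply wrapped in a list.

```python
class SummarizationInput(list):
    """Stand-in for HF_SummarizationInput: a list subclass used as a type marker."""

def decode_sample(encoded):
    # Inputs arrive as a dict of model arguments; keep only the token ids.
    if isinstance(encoded, dict):
        return SummarizationInput([encoded['input_ids']])
    # Targets were already reduced to bare ids in `encodes`.
    return [encoded]

inputs = {'input_ids': [0, 5, 6, 2], 'attention_mask': [1, 1, 1, 1]}
targets = [0, 7, 2]

decoded_x = decode_sample(inputs)
decoded_y = decode_sample(targets)
```

Either way the result is a one-element list of ids, which is the shape the downstream tokenizer decoding expects for both inputs and targets.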

In [None]:
hf_batch_tfm = HF_SummarizationBatchTransform(hf_arch, hf_tokenizer)

blocks = ( 
    HF_TextBlock(hf_arch, hf_tokenizer), 
    HF_TextBlock(hf_arch, hf_tokenizer, hf_batch_tfm=hf_batch_tfm, max_length=150, hf_input_idxs=[0,1])
)

dblock = DataBlock(blocks=blocks, 
                   get_x=ColReader('article'), 
                   get_y=ColReader('highlights'), 
                   splitter=RandomSplitter())

In [None]:
# dblock.summary(cnndm_df)

In [None]:
dls = dblock.dataloaders(cnndm_df, bs=4)

In [None]:
b = dls.one_batch()

In [None]:
len(b), b[0]['input_ids'].shape, b[1].shape

(2, torch.Size([4, 512]), torch.Size([4, 74]))

In [None]:
#export
@typedispatch
def show_batch(x:HF_SummarizationInput, y, samples, dataloaders=None, ctxs=None, max_n=6, **kwargs):  
    res = L([ (s[0], s[1]) for s in samples ])          
    display_df(pd.DataFrame(res, columns=['text', 'target'])[:max_n])
    return ctxs

In [None]:
dls.show_batch(dataloaders=dls, max_n=2)

Unnamed: 0,text,target
0,"(CNN) -- Longtime talk show host Larry King says he's joined an effort to buy the Los Angeles Dodgers. ""It would be a thrill of a lifetime to be a part owner, a partial owner, of a team I grew up rooting for as a child in Brooklyn,"" the former host of CNN's ""Larry King Live"" said Wednesday."" ""To go to a ballpark and have an owner's box, to even have a say in a possible trade -- are you out of your mind?"" he asked rhetorically. King says he's part of group of investors interested in acquiring the franchise, despite its apparent financial troubles and unresolved contract issues with Fox Sports. Major League Baseball, which took charge of the team in April, has been embroiled in legal battles over future media rights after baseball Commissioner Bud Selig rejected a $3 billion television deal with Fox. The beleaguered club then filed for bankruptcy in June and has since drawn a number of high-profile buyers into the bidding process after team owner Frank McCourt agreed to sell. A court hearing over the Dodgers' future media rights is scheduled for December 7. King's investor group, meanwhile, is led by insurance agent Dennis Gilbert, who also works as a special assistant to Chicago White Sox Chairman Jerry Reinsdorf. ""What bigger thrill?"" asked King, a native of Brooklyn, New York, which the Dodgers once called home. The team, formerly known as the Trolley Dodgers because of the maze of trolley cars that Brooklynites once dodged in the streets, eventually shortened its name, then and moved to California, kicking off its first L.A. season in 1958, to the dismay of many New Yorkers. ""The emotional part would be that they'd have to carry me out,"" King said of his possible part-ownership stake in the team.","King says he's bidding as a part of an investor group.\nThe former CNN talk show host is a native of Brooklyn, where the Dodgers once played.\nThe club filed for bankruptcy in June, and owner Frank McCourt agreed to sell."
1,"(CNN) -- Days after a 10-year-old girl was snatched from her bedroom in the middle of the night -- but found alive nearly 12 hours later and a few miles away -- Los Angeles police pleaded Saturday for the public's help tracking down those responsible. Though they have said they believe two men were involved in the kidnapping, by Saturday police had only identified one by name as a ""wanted suspect"": 30-year-old Tobias Dustin Summers. In a press conference a few hours before Summers' image and information came out, Los Angeles Police Department spokesman Andrew Smith said that police believe this was a ""stranger abduction."" At the same time, he said, investigators are not ruling anything out -- such as whether the abductors had something to do with her family or that acquaintances of hers might have been involved. ""Until we can find these individuals that perpetrated this, we won't know if this was a random case, or whether it was targeted against this family or this child for any particular reason,"" Smith said. ""So, right now, we don't know."" Around 1 a.m. Wednesday, the girl's mother last saw her safe in bed in their home in the Los Angeles neighborhood of Northridge. The mother heard noises at about 3:30 a.m. and checked on her daughter again, discovering that she was missing. That set off an intensive search for the girl that ended about 2:50 p.m. Wednesday, when she was found about five miles southwest of her home. ""A Good Samaritan... directed the girl to some police officers nearby, and she was... transported to a local hospital, where she received treatment,"" Cmdr. Andrew Smith, a Los Angeles Police Department spokesman, told reporters Saturday afternoon. By Saturday, the 10-year-old was back home with her parents, with police on site to provide security and protect the family's privacy. Some 20 members of the police department's robbery and homicide division are working the case, Smith said. 
In a first preliminary interview with investigators, the girl said she had been put in several vehicles during her 12-hour ordeal, driving to places around the San Fernando Valley. Police were able to locate at least one of those locations -- a storage facility about two miles from the girl's home -- Smith said. The LAPD spokesman did not detail what happened to the girl in that time, urging media to do the same so as not to ""further traumatize... that poor girl."" Smith said that authorities don't have any reason","A 10-year-old girl was taken in the middle of the night from her L.A. home, police say.\nShe was found about 12 hours later, some 5 miles from her home.\nL.A. police release information about 1 of the 2 suspects in the case."


## Tests

The tests below ensure the core DataBlock code above works for **all** pretrained summarization models available in huggingface. These tests are excluded from the CI workflow because of how long they would take to run and the amount of data they would need to download.

**Note**: Feel free to modify the code below to test whatever pretrained summarization models you are working with ... and if any of your pretrained summarization models fail, please submit a GitHub issue *(or a PR if you'd like to fix it yourself)*
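
One way to organize such a run, sketched here with only the standard library, is to record a PASSED/FAILED row per model instead of stopping at the first failure. The `run_checks` helper, the model names, and the placeholder checks are hypothetical; in practice each check would build the `DataLoaders` for one model and assert on the batch shapes.

```python
def run_checks(named_checks):
    """Run each (name, check) pair and collect (name, status, error) rows."""
    results = []
    for name, check in named_checks:
        try:
            check()  # expected to raise (e.g. AssertionError) on failure
            results.append((name, 'PASSED', ''))
        except Exception as err:
            results.append((name, 'FAILED', repr(err)))
    return results

def failing_check():
    # Placeholder for a model whose batch shape is not what we expect.
    assert (2, 40) == (2, 50), 'unexpected target shape'

results = run_checks([
    ('model-a', lambda: None),    # placeholder check that passes
    ('model-b', failing_check),   # placeholder check that fails
])

for name, status, error in results:
    print(f'{name}\t{status}\t{error}')
```

Collecting rows this way makes it easy to review every model's outcome at the end of a long run, for example by loading `results` into a DataFrame.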

In [None]:
BLURR_MODEL_HELPER.get_models(task='ConditionalGeneration')

[transformers.modeling_bart.BartForConditionalGeneration,
 transformers.modeling_mbart.MBartForConditionalGeneration,
 transformers.modeling_pegasus.PegasusForConditionalGeneration,
 transformers.modeling_t5.T5ForConditionalGeneration]

In [None]:
pretrained_model_names = [
    ('facebook/bart-base',BartForConditionalGeneration),
    ('sshleifer/tiny-mbart', MBartForConditionalGeneration),
    ('google/pegasus-cnn_dailymail', PegasusForConditionalGeneration),
    ('t5-small', T5ForConditionalGeneration)
]

In [None]:
path = Path('./')
cnndm_df = pd.read_csv(path/'cnndm_sample.csv')

In [None]:
#slow
#hide_output
task = HF_TASKS_ALL.ConditionalGeneration
bsz = 2

test_results = []
for model_name, model_cls in pretrained_model_names:
    error=None
    
    print(f'=== {model_name} ===\n')
    
    hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(model_name, 
                                                                                   task=task,
                                                                                   model_cls=model_cls)
    
    print(f'architecture:\t{hf_arch}\ntokenizer:\t{type(hf_tokenizer).__name__}\n')
    
    hf_batch_tfm = HF_SummarizationBatchTransform(hf_arch, hf_tokenizer)

    blocks = ( 
        HF_TextBlock(hf_arch, hf_tokenizer, padding='max_length', max_length=256), 
        HF_TextBlock(hf_arch, hf_tokenizer, hf_batch_tfm=hf_batch_tfm, padding='max_length', max_length=50, 
                     hf_input_idxs=[0,1])
    )

    dblock = DataBlock(blocks=blocks, 
                       get_x=ColReader('article'), 
                       get_y=ColReader('highlights'), 
                       splitter=RandomSplitter())

    dls = dblock.dataloaders(cnndm_df, bs=bsz) 
    b = dls.one_batch()
    
    try:
        print('*** TESTING DataLoaders ***\n')
        test_eq(len(b), 2)
        test_eq(len(b[0]['input_ids']), bsz)
        test_eq(b[0]['input_ids'].shape, torch.Size([bsz, 256]))
        test_eq(len(b[1]), bsz)
        test_eq(b[1].shape, torch.Size([bsz,50]))

        if (hasattr(hf_tokenizer, 'add_prefix_space')):
            test_eq(dls.tfms[0].kwargs['add_prefix_space'], True)
            
        test_results.append((hf_arch, type(hf_tokenizer).__name__, model_name, 'PASSED', ''))
        dls.show_batch(dataloaders=dls, max_n=2)
        
    except Exception as err:
        test_results.append((hf_arch, type(hf_tokenizer).__name__, model_name, 'FAILED', err))

=== facebook/bart-base ===

architecture:	bart
tokenizer:	BartTokenizer

*** TESTING DataLoaders ***



Unnamed: 0,text,target
0,"(EW.com) -- The MPAA has come under some flack of late for its one-size-fits-all rating system and vague-at-best explanations for those ratings. But there's a fun flip-side to the murkiness: Speculating on what those ratings and their explanations might infer about the movie in question — in this case, ""The Dark Knight Rises."" The MPAA handed a PG-13 rating today to ""The Dark Knight Rises,"" for ""intense sequences of violence and action, some sensuality and language."" The rating itself does not mean the movie is totally done -- films often screen well before the director is finished with technical elements like visual effects, sound design, and color timing. But it does provide us with a tantalizing indication for what may be in store with a wildly anticipated film that has otherwise put a high premium on plot details. Namely: Language? Sensuality? Intriguing! As a point of comparison, 2008′s ""The Dark Knight"" won its PG-13 for ""intense sequences of violence and some menace""; 2005′s ""Batman Begins"" was a PG-13 due to ""intense action violence, disturbing","The MPAA handed a PG-13 rating today to ""The Dark Knight Rises""\nThe rating is for ""intense sequences of violence and action, some sensuality and language""\nIt does provide us with a tantalizing indication"
1,"(CNN) -- Trail Life USA, the group that was launched after the Boys Scouts voted to allow gay members, held its inaugural convention over the weekend in Nashville. More than 1,000 people attended, including former Arkansas Gov. Mike Huckabee. The group, which will officially launch in 2014, says it expects to become a ""premier"" Christian organization for boys and young men. At the weekend event, which reporters were not allowed to attend, the group unveiled its name and logo. The logo includes the words adventure, character and leadership. ""Trail Life USA will be inclusive of boys, regardless of religion, race, national origin or socioeconomic status, and accept boys who are experiencing same-sex attractions or gender confusion,"" a statement from the group says. ""However, it will not admit youth who are open or avowed about their homosexuality, and it will not admit boys who are not 'biologically male' or boys who wish to dress and act like girls."" Boy Scout vote. In May, more than 60% of 1,400-member national council of the BSA voted to allow openly gay youths to join scouting. The change takes effect January 1. The BSA, however, will maintain its ban on gay adult leaders. The National Jewish","More than 1,000 attend group's inaugural national convention.\nGroup's goal is to become a ""premier"" Christian organization for boys and young men.\nIn May, the Boys Scouts of America voted to allow openly gay youths to"


=== sshleifer/tiny-mbart ===




architecture:	mbart
tokenizer:	MBartTokenizer

*** TESTING DataLoaders ***



Unnamed: 0,text,target
0,"Istanbul (CNN) -- Turkey's judicial system faced an uproar this week after one of the country's highest courts upheld a decision to reduce sentences against 26 men convicted of having sex with a 13-year old girl. Public outrage stemmed from a court ruling that the 13-year old girl had willingly engaged in ""consensual"" sexual relations with the 26 men. Among the growing chorus of critics was Turkey's President Abdullah Gul. ""I take particular care not to make any direct statements on issues that are in the judicial process,"" Gul wrote in a series of statements on his Twitter account on Friday. ""[But] the decision about reducing the punishment related with what happened to a young child of ours made me deeply uncomfortable... there is still the possibility for an appeal. I am hoping for an outcome that will comfort the public conscience."" The case in question dates back to 2002, when 26 men from the southeastern Turkish town of Mardin were accused of repeatedly having sex with a 13-year old girl identified only by the initials ""N.C."" According to Turkish media reports, the","Sentences for 26 men convicted of having sex with a 13-year-old were reduced. The ruling said the girl had engaged in ""consensual"" sex. Turkey's president said the ruling"
1,"(CNN) -- After a severe earthquake centered in Pakistan's Kashmir province killed more than 70,000 people in 2005, teams from a nonprofit architecture group based in London, England, helped the region start to rebuild. The group, Article 25, worked with local craftspeople to develop a design that could withstand earthquakes and trained them to build the structures. That experience may provide some lessons for the rebuilding of Haiti, where Article 25 is also planning to help with reconstruction, according to Robin Cross, an architect who is the organization's director of projects. As in Haiti, at least some of the death and injury in Pakistan stemmed from local building methods. ""The important point is that it isn't generally earthquakes that kill people,"" Cross said. ""It's generally buildings that kill people. Building design is a way to solve that problem."" In Pakistan, Article 25 worked with local craftspeople to determine the best way to build structures that could withstand quakes and then helped train people to build them. ""By the time we built 80 to 100 buildings and we pulled out, we were leaving not just buildings, but also a capacity to","Earthquake centered in Kashmir region of Pakistan killed more than 70,000 in 2005. London-based nonprofit provided architectural help to start rebuilding. Robin Cross says lightweight framing was used to build earthquake"


=== google/pegasus-cnn_dailymail ===




architecture:	pegasus
tokenizer:	PegasusTokenizer

*** TESTING DataLoaders ***



Unnamed: 0,text,target
0,"Los Angeles (CNN) -- Lindsay Lohan has one less legal problem to worry about after a prosecutor decided not to charge her in connection with an altercation while she was in a substance abuse rehab program in December. Lohan, who faces a preliminary hearing on a felony grand theft charge April 22, could also go back to jail on a probation violation charge on the same day. But the Riverside County, California, district attorney decided Tuesday not to pursue a possible assault charge against the actress for a December 12, 2010, incident with Dawn Holland, a Betty Ford Center staffer, the prosecutor's spokesman said. ""Our office has completed review of the case, and we will not file charges due to insufficient evidence,"" spokesman John Hall said. Lohan checked herself into the Betty Ford Center in Rancho Mirage, California, for substance abuse rehab on September 28, 2010, just days after she dodged jail on another probation violation. A Los Angeles County judge later ordered her to remain in the drug rehab program until January 3 for failing a drug test while on supervised probation for a 2007 drunken-driving charge. An incident three weeks after she was released from rehab led to her latest legal problems. She allegedly walked out of a Venice, California, jewelry store wearing a necklace that she had not paid for, according to police. Lohan rejected","County district attorney will not charge Lohan in Betty Ford Center incident. She was investigated for the December 12, 2010, incident with a staffer. Lohan still faces a felony theft charge in Los Angeles."
1,"Fashion Week has begun in New York, but for space enthusiasts, the most exciting glamor shots are coming from Mars. The much-celebrated rover Curiosity has so far strutted 109 meters (358 feet) on the surface of the Red Planet, according to its odometer, and she's looking great, NASA scientists say. The 2000-pound SUV-sized rover has been on the surface of Mars for about one month, and operating as expected. ""There have been no significant anomalies or wild cards thrown in where the performance on Mars differed significantly from the Earth,"" said Michael Watkins, Curiosity mission manager at NASA's Jet Propulsion Laboratory at a news briefing Thursday. ""That's a real testament to the engineers that developed the system."" A new photo from the rover's camera on the mast shows off the rover's arm against the spectacular Martian landscape. On the arm is the MAHLI camera, with resolution so great that it can resolve down to the grain of talcum powder, said Aileen Yingst at Thursday's NASA news briefing. Yingst is the deputy principal investigator for Curiosity's Mars Hand Lens Imager at the Planetary Science Institute, Tucson. All that separated this camera's lens from the Martian environment was a dust cover, which appears to",Scientists aren't seeing major 'wild cards' in rover performance. Dust cover on high-resolution camera appears to be intact. Sampling of Martian material may begin in about a month. Curiosity has been on Mars since August 6


=== t5-small ===




architecture:	t5
tokenizer:	T5Tokenizer

*** TESTING DataLoaders ***



Unnamed: 0,text,target
0,"A California poultry producer announced Monday that it is working with the federal health officials after an estimated 278 illnesses were reported in 18 states. Raw chicken products from Foster Farms plants have been identified as the likely source of this outbreak of Salmonella Heidelberg. Illnesses were linked to the facility through investigations conducted by local, state, and federal officials. The outbreak is continuing and no recall has been issued. The Food Safety and Inspection Service (FSIS), an agency of the U.S. Department of Agriculture, has been unable so far to identify the specific product or production period, but raw products from the potentially affected facilities bear one of the following numbers on the packaging: P6137, P6137A, P7632 and mainly distributed to retail outlets in California, Oregon and Washington state. Food poisoning: What you need to know. The Centers for Disease Control and Prevention is partnering with state health departments to monitor the outbreak while FSIS continues its investigation, but due to the government shutdown, current information may not be available on the agencies' websites. ""While the company, FSIS and CDC continue to investigate the issue, Foster Farms has instituted a number of additional food safety practices, processes and technology",Raw chicken products from Foster Farms plants have been ID'd as the likely source. Federal inspectors have not yet isolated the specific product or production period. An estimated 278 illnesses have been reported in 18 states. Affected
1,"(CNN) -- The massive earthquake and tsunami that hit Japan on Friday spared the island of Indonesia, a nearby developing country that was devastated in 2004 by one of the deadliest tsunamis in history. Helping after that disaster was CNN Hero Robin Lim. Lim is the founder of Indonesia's Yayasan Bumi Sehat health clinics, which provide free prenatal and birthing care to women in need. Lim recently spoke with CNN's Ebonne Ruffins about how her midwife teams respond in natural disasters, balancing critical medical needs with cultural traditions. Ebonne Ruffins: What's it like to work in Indonesia and, specifically, after the 2004 tsunami? Robin Lim: In Indonesia, there is a great need for us because many mothers want the help of professional birth attendants but can't afford them. So they come to us. It was not easy to start so grass-roots in Bali, completely reliant on donations, but we did it. And when the tsunami happened a few years later in Aceh, we were early responders. So many on the eastern coast lost everything, and it was very difficult to witness. But we had to be there","Robin Lim founded health clinics in Indonesia that offer free prenatal and birthing care. When a tsunami hit in 2004, her midwife teams had to work with limited resources. Lim: We try to have respect for everyone's"


## Cleanup

In [None]:
#hide
from nbdev.export import notebook2script
notebook2script()

Converted 00_utils.ipynb.
Converted 01_data-core.ipynb.
Converted 01a_data-token-classification.ipynb.
Converted 01b_data-question-answering.ipynb.
Converted 01e_data-summarization.ipynb.
Converted 01z_data-language-modeling.ipynb.
Converted 02_modeling-core.ipynb.
Converted 02a_modeling-token-classification.ipynb.
Converted 02b_modeling-question-answering.ipynb.
Converted 02e_modeling-summarization.ipynb.
Converted 02z_modeling-language-modeling.ipynb.
Converted index.ipynb.
