In this project, we apply a powerful approach known as Universal Language Model Fine-tuning (ULMFiT). We’ll start from a language model that’s already been trained on large amounts of Wikipedia text, then adapt (“fine-tune”) it to the AG News corpus. After that, we’ll build a text classifier on top of our freshly fine-tuned encoder. The result is a model that can assign each news article to one of four categories: World, Sports, Business, or Sci/Tech.

## Preplanning
Before we dive in, let’s outline our overall plan:

* **Objective:** Create an NLP model that classifies AG News articles, taking advantage of an existing Wikipedia-pretrained model.
* **Dataset:** Use the AG News corpus, which features four main classes of news stories.
* **Strategy:**
    + Fine-tune the Wikipedia-trained language model on AG News text. This helps the model learn domain-specific nuances (news vocabulary, style, etc.).
    + Use the fine-tuned model as an encoder for classification. Train a classifier head on top of it to predict among four news categories.
* **Tools:** We’ll rely on the fastai library’s text APIs, which streamline ULMFiT, letting us focus on the conceptual side.

With that frame in mind, let’s start coding!

Here, we download and extract the AG News corpus. The dataset conveniently arrives in CSV format with clearly designated train and test files.

In [None]:
from fastai.text.all import *
path = untar_data(URLs.AG_NEWS)

In [3]:
path.ls()

(#4) [Path('/root/.fastai/data/ag_news_csv/test.csv'),Path('/root/.fastai/data/ag_news_csv/readme.txt'),Path('/root/.fastai/data/ag_news_csv/classes.txt'),Path('/root/.fastai/data/ag_news_csv/train.csv')]

In [4]:
!cat /root/.fastai/data/ag_news_csv/readme.txt

AG's News Topic Classification Dataset

Version 3, Updated 09/09/2015


ORIGIN

AG is a collection of more than 1 million news articles. News articles have been gathered from more than 2000  news sources by ComeToMyHead in more than 1 year of activity. ComeToMyHead is an academic news search engine which has been running since July, 2004. The dataset is provided by the academic comunity for research purposes in data mining (clustering, classification, etc), information retrieval (ranking, search, etc), xml, data compression, data streaming, and any other non-commercial activity. For more information, please refer to the link http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .

The AG's news topic classification dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the dataset above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in 

In [5]:
!cat /root/.fastai/data/ag_news_csv/classes.txt

World
Sports
Business
Sci/Tech


# Training the Language Model
## Inspect the Dataset and Get the Class Labels

Because the data is in CSV files, we’ll read them using pandas. We combine the train and test files into one “grand corpus” for fine-tuning the language model. Language modeling uses only the text, not the class labels, so `df_lm` can draw on both splits.

In [6]:
col_names = ["class_index", "title", "description"]
df_train = pd.read_csv(path/'train.csv', header=None, names=col_names, low_memory=False)
df_valid = pd.read_csv(path/'test.csv', header=None, names=col_names, low_memory=False)
df_lm = pd.concat([df_train, df_valid], axis=0, ignore_index=True)

In [7]:
dls_lm = TextDataLoaders.from_df(df_lm, path=path, text_col=(1,2), label_col=0, is_lm=True, seed=42, shuffle=False)

In [8]:
dls_lm.train.show_batch()

Unnamed: 0,text,text_
0,"xxbos xxfld 1 xxup fg xxmaj holds xxmaj talks with xxmaj niger xxmaj delta xxmaj militants xxfld 2 xxmaj the xxmaj federal xxmaj government yesterday revealed that security agencies are holding talks in xxmaj abuja with the leadership of the xxmaj niger xxmaj delta xxmaj peoples xxmaj volunteer xxmaj force ( xxunk ) , led by xxmaj xxunk xxmaj asari xxmaj dokubo , over the continued unrest in the xxmaj niger xxmaj","xxfld 1 xxup fg xxmaj holds xxmaj talks with xxmaj niger xxmaj delta xxmaj militants xxfld 2 xxmaj the xxmaj federal xxmaj government yesterday revealed that security agencies are holding talks in xxmaj abuja with the leadership of the xxmaj niger xxmaj delta xxmaj peoples xxmaj volunteer xxmaj force ( xxunk ) , led by xxmaj xxunk xxmaj asari xxmaj dokubo , over the continued unrest in the xxmaj niger xxmaj delta"
1,"xxmaj street xxmaj global xxmaj advisors , xxmaj timothy xxup b. xxmaj xxunk , died of a heart attack on xxmaj tuesday night , the company said yesterday . xxmaj he was 53 . xxbos xxfld 1 xxmaj texas not popular choice xxfld 2 xxmaj the xxmaj breeders ' xxmaj cup is making its first visit to xxmaj texas and not everyone thinks the world of it . xxmaj there has been","street xxmaj global xxmaj advisors , xxmaj timothy xxup b. xxmaj xxunk , died of a heart attack on xxmaj tuesday night , the company said yesterday . xxmaj he was 53 . xxbos xxfld 1 xxmaj texas not popular choice xxfld 2 xxmaj the xxmaj breeders ' xxmaj cup is making its first visit to xxmaj texas and not everyone thinks the world of it . xxmaj there has been grumbling"
2,"the body to grow its own bypasses , at first in the legs and , if that works , perhaps later in the heart … xxbos xxfld 1 xxup eu : xxmaj coke xxup eu anti - trust case settlement closer xxfld 2 coca - cola xxmaj xxunk xxmaj bottling xxmaj company has announced developments in the long - running xxup eu investigation into anti - competitive practices . xxbos xxfld 1","body to grow its own bypasses , at first in the legs and , if that works , perhaps later in the heart … xxbos xxfld 1 xxup eu : xxmaj coke xxup eu anti - trust case settlement closer xxfld 2 coca - cola xxmaj xxunk xxmaj bottling xxmaj company has announced developments in the long - running xxup eu investigation into anti - competitive practices . xxbos xxfld 1 xxmaj"
3,"believes xxmaj dennis xxmaj rommedahl 's sparkling winner will provide the platform for the xxmaj dane to recapture the form that made him one of xxmaj europe 's most feared wingers . xxbos xxfld 1 xxmaj business as usual for xxup ata despite bankruptcy filing xxfld 2 xxmaj although the future of xxup ata xxmaj airlines xxmaj inc . remains up in the air , planes still were landing xxmaj wednesday at","xxmaj dennis xxmaj rommedahl 's sparkling winner will provide the platform for the xxmaj dane to recapture the form that made him one of xxmaj europe 's most feared wingers . xxbos xxfld 1 xxmaj business as usual for xxup ata despite bankruptcy filing xxfld 2 xxmaj although the future of xxup ata xxmaj airlines xxmaj inc . remains up in the air , planes still were landing xxmaj wednesday at the"
4,"of xxunk percent . xxmaj it is , state officials said , the lowest interest rate offered in more than 30 years . xxbos xxfld 1 xxup n.c . xxmaj state 's xxmaj hodge xxmaj puts xxmaj self xxmaj among xxmaj elite ( ap ) xxfld 2 xxup ap - xxmaj julius xxmaj hodge drove to the basket and scored , then turned to run back on defense . xxmaj on the","xxunk percent . xxmaj it is , state officials said , the lowest interest rate offered in more than 30 years . xxbos xxfld 1 xxup n.c . xxmaj state 's xxmaj hodge xxmaj puts xxmaj self xxmaj among xxmaj elite ( ap ) xxfld 2 xxup ap - xxmaj julius xxmaj hodge drove to the basket and scored , then turned to run back on defense . xxmaj on the way"
5,tuesday xxbos xxfld 1 xxmaj bryant 's xxmaj accuser xxmaj must xxmaj be xxmaj identified xxfld 2 xxup denver - a federal judge in xxmaj colorado has rejected a request from the woman accusing xxup nba star xxmaj kobe xxmaj bryant of rape to remain anonymous in her civil lawsuit . xxbos xxfld 1 xxmaj sox and bonds xxfld 2 xxmaj there are games in which players renew their vows as teammates,"xxbos xxfld 1 xxmaj bryant 's xxmaj accuser xxmaj must xxmaj be xxmaj identified xxfld 2 xxup denver - a federal judge in xxmaj colorado has rejected a request from the woman accusing xxup nba star xxmaj kobe xxmaj bryant of rape to remain anonymous in her civil lawsuit . xxbos xxfld 1 xxmaj sox and bonds xxfld 2 xxmaj there are games in which players renew their vows as teammates ,"
6,"xxmaj boston to the 6 - 2 win in xxmaj game 2 . xxbos xxfld 1 xxup us xxmaj consumer xxmaj confidence xxmaj tumbles in xxmaj august xxfld 2 xxup new xxup york ( reuters ) - xxup u.s . consumer confidence fell sharply in xxmaj august , breaking four straight months of gains , as a slowdown in job creation and rising oil prices weighed on sentiment . xxbos xxfld 1","boston to the 6 - 2 win in xxmaj game 2 . xxbos xxfld 1 xxup us xxmaj consumer xxmaj confidence xxmaj tumbles in xxmaj august xxfld 2 xxup new xxup york ( reuters ) - xxup u.s . consumer confidence fell sharply in xxmaj august , breaking four straight months of gains , as a slowdown in job creation and rising oil prices weighed on sentiment . xxbos xxfld 1 temple"
7,"\ \ xxmaj early xxmaj internet surfers , which of course means something less than a decade ago for most of us , will remember the days when xxmaj netscape was the only real browser in town . xxmaj for those who do n't remember , there was a time when xxmaj netscape was for sale in … xxbos xxfld 1 xxmaj everton ' considering ' increased xxmaj rooney bid xxfld 2","\ xxmaj early xxmaj internet surfers , which of course means something less than a decade ago for most of us , will remember the days when xxmaj netscape was the only real browser in town . xxmaj for those who do n't remember , there was a time when xxmaj netscape was for sale in … xxbos xxfld 1 xxmaj everton ' considering ' increased xxmaj rooney bid xxfld 2 xxmaj"
8,"reuters ) xxfld 2 xxmaj reuters - xxmaj the xxmaj japanese government downgraded its \ view on the economy slightly on xxmaj tuesday , citing weaker exports \ and output , but it said a recovery was continuing due to steady \ domestic demand , in terms of both personal consumption and \ capital spending . "" the economy continues to recover , while \ some weak movements have been seen recently",") xxfld 2 xxmaj reuters - xxmaj the xxmaj japanese government downgraded its \ view on the economy slightly on xxmaj tuesday , citing weaker exports \ and output , but it said a recovery was continuing due to steady \ domestic demand , in terms of both personal consumption and \ capital spending . "" the economy continues to recover , while \ some weak movements have been seen recently ,"


Here the title and description columns are combined into a single text feature; the `xxfld 1` and `xxfld 2` tokens mark where each field begins. Note that the independent variable `text` and the dependent target `text_` are offset by a single token: the fine-tuned language model learns to predict the next token at each position of `text`.
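To make the one-token offset concrete, here is a minimal sketch in plain Python. The token list is copied from the start of row 0 above; the helper name `make_lm_pair` is hypothetical and only illustrates the idea, not fastai's internal batching.

```python
def make_lm_pair(tokens):
    """Split a token stream into (inputs, targets), where targets is inputs shifted by one."""
    inputs = tokens[:-1]   # everything except the last token
    targets = tokens[1:]   # everything except the first token
    return inputs, targets

stream = ["xxbos", "xxfld", "1", "xxup", "fg", "xxmaj", "holds", "xxmaj", "talks"]
inputs, targets = make_lm_pair(stream)

# At position i the model has seen inputs[:i + 1] and should predict targets[i].
for seen, expected in zip(inputs[:4], targets[:4]):
    print(f"{seen!r:>8} -> {expected!r}")
```

Each target is simply the token that follows the corresponding input, which is why the two columns in the batch above look almost identical.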

## Fine-Tune the Language Model
We'll base training on the classic AWD-LSTM architecture (as used in ULMFiT). We’ll do a quick one-epoch pass first, then unfreeze the remaining layers for deeper fine-tuning. Because the language model is so large, we'll also train in mixed precision to save time and resources.

In [9]:
learn = language_model_learner(
    dls_lm, AWD_LSTM, drop_mult=0.3,
    metrics=[accuracy, Perplexity()]).to_fp16()
learn.fit_one_cycle(1, 2e-2)

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,3.208871,3.068084,0.44809,21.500673,10:58


In [10]:
learn.save('1epoch')
learn = learn.load('1epoch')
learn.unfreeze()
learn.fit_one_cycle(10, 2e-3)
learn.save_encoder('finetuned')

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,2.98661,2.879777,0.471676,17.810308,11:50
1,2.830561,2.799445,0.482623,16.435516,11:54
2,2.637185,2.740306,0.493215,15.491731,11:57
3,2.456956,2.715974,0.49963,15.119334,11:57
4,2.294566,2.704736,0.505257,14.950377,11:56
5,2.147095,2.69848,0.510063,14.857126,11:58
6,2.005627,2.695754,0.514727,14.816684,11:58
7,1.902904,2.703961,0.518032,14.938793,12:00
8,1.820011,2.705783,0.520096,14.966028,12:01
9,1.797798,2.743584,0.518972,15.542585,12:02


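An accuracy of 52% might not sound high, but it's actually a really good value for a language model predicting the next word in a sentence. Common values for this on other datasets are also in the 40-50% range. Validation loss bottoms out around epoch 6 and drifts up slightly afterwards, so the last few epochs add little. Because the model took hours to train, we make sure the encoder is saved before moving on.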

In [11]:
learn.save_encoder('finetuned')

Some important components to note from the code above:

1. `language_model_learner`: Loads a pretrained LM (from Wikipedia), plus a final layer we’ll adapt to our new text domain.
2. `fit_one_cycle`: A training schedule that typically yields fast, stable convergence.
3. `unfreeze()`: Unlocks earlier layers, allowing the model to adjust them for our AG News text.
4. `save_encoder('finetuned')`: Saves only the encoder portion of the model. We’ll need this for classification.
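The `perplexity` column in the training tables above is the exponential of the validation loss, so the two always move together. A quick check in plain Python, using the last row of the ten-epoch run (the variable names are just for illustration):

```python
import math

valid_loss = 2.743584            # valid_loss from epoch 9 of the unfrozen run
reported_perplexity = 15.542585  # perplexity reported for the same epoch

perplexity = math.exp(valid_loss)
print(f"exp({valid_loss}) = {perplexity:.4f}")

# The recomputed value agrees with the reported one up to rounding.
assert math.isclose(perplexity, reported_perplexity, rel_tol=1e-4)
```

Informally, a perplexity of about 15 means the model is roughly as uncertain as if it were choosing uniformly among 15 candidate tokens at each step.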

## Sanity Check: Generate Some Text
Language models can have fun creative uses: we can sample from them to see if the fine-tuning “vocabulary” looks correct.

In [13]:
TEXT = "Netflix Launches New"
N_WORDS = 40
N_SENTENCES = 2
preds = [learn.predict(TEXT, N_WORDS, temperature=0.75)
         for _ in range(N_SENTENCES)] 
print("\n".join(preds))

Netflix Launches New DVD Rental Service ( ap ) xxfld 2 AP - The DVD rental business is offering a boost to a new DVD rental service , the Blu - ray Disc ,
Netflix Launches New Online Service ( ap ) xxfld 2 AP - Online video service Netflix Inc . on Monday said it was launching a service that lets users watch movies , music and games faster


If the output reads like plausible news text about Netflix, we know the domain adaptation is at least somewhat working.
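The `temperature=0.75` argument controls how adventurous the sampling is. The sketch below shows the general idea of temperature sampling in plain Python with made-up logits; it is not fastai's internal code, and `softmax_with_temperature` is a hypothetical helper.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; T < 1 sharpens them, T > 1 flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate next tokens

probs_default = softmax_with_temperature(logits, 1.0)
probs_cooler = softmax_with_temperature(logits, 0.75)

for t, probs in ((1.0, probs_default), (0.75, probs_cooler)):
    print(t, [round(p, 3) for p in probs])
```

With a temperature below 1, more probability mass goes to the top-scoring token, so the generated text stays closer to the model's most likely continuation.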

# Build a Classifier on the Fine-Tuned Encoder
Now for the real objective: classifying AG News into four categories. We’ll feed the same text columns to `TextDataLoaders`, but this time with `is_lm=False`, so the labels in column 0 become the targets. We also pass in the vocabulary from our language model so that token ids line up with the embeddings in the fine-tuned encoder.
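Why the shared vocabulary matters: the encoder's embedding matrix is indexed by each token's position in the vocabulary, so a different token order would look up the wrong rows. A toy illustration with two hypothetical vocabularies (plain Python lists, not fastai objects):

```python
# Order used when fine-tuning the LM (hypothetical toy vocabulary)
lm_vocab = ["xxunk", "xxpad", "xxbos", "the", "netflix", "senate"]
# Same tokens, but built independently and therefore in a different order
other_vocab = ["xxunk", "xxpad", "xxbos", "senate", "the", "netflix"]

def numericalize(tokens, vocab):
    """Map tokens to integer ids; unknown tokens fall back to id 0 (xxunk)."""
    index = {tok: i for i, tok in enumerate(vocab)}
    return [index.get(tok, 0) for tok in tokens]

tokens = ["xxbos", "the", "senate"]
ids_shared = numericalize(tokens, lm_vocab)
ids_other = numericalize(tokens, other_vocab)

# Reading the mismatched ids through the LM vocabulary yields the wrong tokens.
decoded = [lm_vocab[i] for i in ids_other]
print(ids_shared, ids_other, decoded)
```

Passing `text_vocab=dls_lm.vocab` avoids this mismatch by reusing the language model's token order for the classifier.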


In [14]:
dls_clas = TextDataLoaders.from_df(df_train, path=path, text_col=(1,2), label_col=0, 
                                 is_lm=False, seed=42, shuffle=False, text_vocab=dls_lm.vocab)

dls_clas.show_batch(max_n=3)



Unnamed: 0,text,category
0,"xxbos xxfld 1 xxmaj kyoto is xxmaj dead - xxmaj long xxmaj live xxmaj pragmatism xxfld 2 xxmaj there 's troubling news ( ft subscription xxunk , alternate copy here ) coming from xxmaj japan , where the xxmaj kyoto protocol on xxmaj greenhouse xxmaj emissions was born in 1997 . xxmaj it seems that the xxmaj japanese are n't going to be able to meet their emissions targets specified in the agreement in time . xxmaj indeed , unless they buy a "" large quantity "" of emissions credits from other countries , they 're not going to be able to meet their commitment at all . xxmaj xxunk xxmaj sugiyama , a climate expert at the xxmaj central xxmaj research xxmaj institute of xxmaj electric xxmaj power xxmaj industry in xxmaj japan , said emissions were rising 1 per cent a year due to a larger - than",4
1,"xxbos xxfld 1 2004 xxup us xxmaj senate xxmaj outlook xxfld 2 xxmaj with all the hoopla over xxmaj bush and xxmaj kerry , some of you may not have been paying close attention to the other races going on in this loaded xxup us political season . xxmaj i 've read a good dozen or so xxmaj senate outlooks , and my blurry eyes and spinning brain kept getting lost in all the numbers and losing track of who , ultimately , was likely to control the xxmaj senate on xxmaj november third . xxmaj so i made my very own xxmaj senate outlook to figure it out ( or add further confusion , depending on what you think of my predictions ) . xxmaj the bad news is , we probably wo n't know who controls the xxmaj senate on xxmaj november third . xxmaj the good news",4
2,"xxbos xxfld 1 xxmaj the xxmaj rundown xxfld 2 4 xxmaj miami at xxup n.c . xxmaj state < xxunk p.m. , xxup espn < / em><br > think the xxmaj wolfpack is kicking itself for that loss two weeks ago at xxmaj north xxmaj carolina ? xxmaj you bet . xxmaj had xxup n.c . xxmaj state ( 4 - 2 , 3 - 1 xxup acc ) won that one , this would be for sole possession of first place in the xxup acc . xxmaj as it is , this is a chance for the xxmaj wolfpack to show it belongs in the upper echelon of the restructured league -- which , for now , is xxmaj miami , xxmaj florida xxmaj state , and a xxunk of also - rans . xxmaj the xxmaj wolfpack 's defense is the best in the nation against the pass",2


In [21]:
learn = text_classifier_learner(dls_clas, AWD_LSTM, drop_mult=0.5,
                                metrics=[accuracy, Perplexity()]).to_fp16()

In [16]:
learn.loss_func

FlattenedLoss of CrossEntropyLoss()

In [17]:
learn.opt

<fastai.optimizer.Optimizer at 0x7f8c5189b490>

In [22]:
learn = learn.load_encoder('finetuned')

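1. `text_classifier_learner`: Creates a classification head on top of the `AWD-LSTM`.
2. `load_encoder('finetuned')`: Loads the encoder we fine-tuned on AG News text, transferring all that new domain knowledge.
3. `Perplexity()`: Carried over from the language-model cell. For a classifier it simply reports the exponential of the validation loss and can be ignored; accuracy is the metric that matters here.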

## Discriminative Learning Rates and Gradual Unfreezing
A hallmark of ULMFiT is the notion that earlier layers need less aggressive updates than later ones. So we train in stages: first only the new classifier head, then unfreezing one layer group at a time, with smaller learning rates applied to earlier layers.

In [23]:
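# Stage 1: only the classifier head is trainable (the encoder starts frozen)
learn.fit_one_cycle(1, 2e-2)
# Stage 2: also unfreeze the last encoder layer group
learn.freeze_to(-2)
learn.fit_one_cycle(1, slice(1e-2/(2.6**4),1e-2))
# Stage 3: unfreeze one more layer group
learn.freeze_to(-3)
learn.fit_one_cycle(1, slice(5e-3/(2.6**4),5e-3))
# Stage 4: unfreeze everything and fine-tune the whole model
learn.unfreeze()
learn.fit_one_cycle(2, slice(1e-3/(2.6**4),1e-3))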

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,0.728121,0.365699,0.879958,1.441521,01:41


epoch,train_loss,valid_loss,accuracy,perplexity,time
0,0.735102,0.360706,0.8985,1.434342,01:55


epoch,train_loss,valid_loss,accuracy,perplexity,time
0,0.777047,0.286184,0.919708,1.331338,02:50


epoch,train_loss,valid_loss,accuracy,perplexity,time
0,0.480853,0.31752,0.887042,1.373717,03:42
1,0.777151,0.304685,0.920875,1.356198,03:42


Each training session here expands the set of trainable layers until everything is fine-tuned together.
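The `slice(lr/(2.6**4), lr)` arguments above give the lowest and highest learning rates; fastai spreads the rates for the layer groups between them on a log scale. Assuming five layer groups (an assumption about how the AWD-LSTM classifier is split, worth checking with `len(learn.opt.param_lists)`), each group's rate is 2.6 times that of the group before it. A plain-Python sketch with the hypothetical helper `discriminative_lrs`:

```python
def discriminative_lrs(max_lr, n_groups, factor=2.6):
    """One learning rate per layer group, earliest group first, spaced geometrically."""
    lowest = max_lr / factor ** (n_groups - 1)
    return [lowest * factor ** i for i in range(n_groups)]

# n_groups=5 is an assumption; max_lr=1e-2 matches the second training stage.
lrs = discriminative_lrs(1e-2, n_groups=5)
for i, lr in enumerate(lrs):
    print(f"group {i}: {lr:.6f}")
```

The earliest group (the embeddings) gets the smallest rate and the classifier head gets the full `max_lr`, which is how general-purpose features learned from Wikipedia are protected from large updates.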

We can see how well the classifier is assigning categories by examining sample predictions.



In [24]:
learn.show_results(max_n=6)

Unnamed: 0,text,category,category_
0,"xxbos xxfld 1 xxmaj area xxmaj college xxmaj football xxmaj capsules xxfld 2 xxmaj navy at xxmaj tulsa < br > xxmaj where : xxmaj xxunk xxmaj stadium xxmaj when : 7 p.m. < br > xxmaj shooting for 3 - 0 : xxmaj navy is off to its first 2 - 0 start since 1996 . xxmaj the xxmaj midshipmen have n't started 3 - 0 since 1979 , when they won their first six games and finished 7 - 4 . xxmaj navy has started 3 - 0 only twice in the past 40 years -- the 1978 team won its first seven games . xxmaj tulsa , which improved from 1 - 11 in 2002 to 8 - 5 last season , the best turnaround in college football , has lost its first two games , 21 - 3 at xxmaj kansas and 38 - 21 at",2,2
1,"xxbos xxfld 1 xxmaj munch xxmaj theft xxmaj focuses on xxmaj museum xxmaj security xxfld 2 xxup oslo , xxmaj norway - xxmaj the brazen daylight theft of xxmaj edvard xxmaj munch 's renowned masterpiece "" the xxmaj scream "" left xxmaj norway 's police scrambling for clues and stirred a debate across xxmaj europe over how to protect art if thieves are willing to use deadly force to take it . xxmaj some expressed fears that works of art are in increasing danger from violent raids - unless , as xxmaj norway 's deputy culture minister put it , "" we lock them in a mountain bunker . "" xxmaj armed , masked robbers stormed into xxmaj oslo 's xxmaj munch xxmaj museum in broad daylight on xxmaj sunday , threatening an employee with a gun and terrifying patrons before they made off with a version of xxmaj munch",1,1
2,"xxbos xxfld 1 xxup xxunk xxmaj lands xxup faa xxmaj conversations xxfld 2 \ "" in response to its xxup xxunk request , xxup epic has received from the xxmaj federal xxmaj aviation \ xxmaj administration ( faa ) transcripts ( pdf ) and audio recordings concerning the \ request by the office of xxup us xxmaj house of xxmaj representatives xxmaj majority xxmaj leader xxmaj tom delay \ ( r - tx ) to the xxup faa regarding the xxmaj may 2003 search for the plane owned by xxmaj texas \ xxmaj state xxmaj representative xxmaj pete xxmaj xxunk ( tail xxmaj number xxup xxunk ) . "" \ "" the xxmaj may 12 , 2003 audio recording of telephone conversations between the faa 's \ xxmaj washington xxmaj operations xxmaj center and various xxup faa field employees clearly indicate \ that the xxup faa employees were misled into",4,3
3,"xxbos xxfld 1 mysql and xxup alter xxup table xxmaj guilty as xxmaj charged xxfld 2 \ \ xxmaj for the last few days xxmaj i 've been using xxunk xxup alter and xxup repair table \ functionality and its caused tons of countless problems and a great deal of lost \ sleep . \ \ xxmaj the first problem i noticed was that for large tables xxup alter xxup table was taking \ hours ! xxmaj lets say you have a 30 g table . xxmaj good luck altering it as the default \ mysql configuration will probably take 100 or more hours . \ \ xxmaj in xxunk defense there are a number of variables you can use to increase the \ performance of an xxup alter but the problem is that the two major ones \ ( xxunk , and xxunk ) ca n't be set at",4,4
4,"xxbos xxfld 1 xxmaj tressel xxmaj trailed by xxmaj allegations xxfld 2 xxmaj oh , if only the biggest problems in xxmaj columbus , xxmaj ohio , were how the xxmaj buckeyes might get their running game going and beat xxmaj purdue today . xxmaj not so . xxmaj in a pair of stories -- one in xxup espn the xxmaj magazine , the other on espn.com -- xxmaj ohio xxmaj state xxmaj coach xxmaj jim xxmaj tressel was first accused by former star running back xxmaj maurice xxmaj clarett of helping him gain access to free cars and of hooking him up with boosters for cash payments . xxmaj the second story traced such scams back to xxmaj tressel 's days as the coach at xxmaj youngstown xxmaj state , in xxmaj clarett 's home town . xxmaj ohio xxmaj state 's response to xxmaj clarett : xxmaj he",2,2
5,"xxbos xxfld 1 xxmaj skype dials up beta software for xxmaj mac xxup os x xxfld 2 xxmaj skype xxmaj technologies xxup sa , of xxmaj luxembourg , xxmaj tuesday released a beta version of its free xxmaj internet telephony software for xxmaj apple xxmaj computer xxmaj inc . 's xxmaj mac xxup os xxunk > advertisement < / p><p><img src=""http : / / ad.doubleclick.net / ad / idg.us.ifw.general / ibmpseries;sz=1x1;ord=200301151450 ? "" width=""1 "" height=""1 "" border=""0 "" / > < a href=""http : / / ad.doubleclick.net / clk;9824455;9690404;u?http : / / ad.doubleclick.net / clk;9473681;9688522;d?http : / / xxrep 3 w .ibm.com / servers / eserver / pseries / campaigns / boardroom / index.html?ca=pseries met = boardroom me = e p_creative = p_infow_rss"">introducing xxup ibm eserver p5 systems . < / a><br / > powered by ibms most advanced 64 - bit microprocessor ( power5(tm ) ) , p5",4,4
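Reading the six rows above, the `category` column is the true label and `category_` is the prediction. The sketch below tallies them in plain Python; the name mapping assumes the 1-based class index follows the line order of `classes.txt`.

```python
# Assumed mapping: class index i corresponds to line i of classes.txt
CLASS_NAMES = {1: "World", 2: "Sports", 3: "Business", 4: "Sci/Tech"}

# (true label, predicted label) pairs copied from the rows shown above
results = [(2, 2), (1, 1), (4, 3), (4, 4), (2, 2), (4, 4)]

correct = sum(1 for actual, predicted in results if actual == predicted)
print(f"{correct}/{len(results)} correct")

# List the misclassified rows with readable class names
mistakes = [(CLASS_NAMES[a], CLASS_NAMES[p]) for a, p in results if a != p]
for actual, predicted in mistakes:
    print(f"true: {actual}, predicted: {predicted}")
```

The single miss in this sample is the FAA transcript story, labeled Sci/Tech but predicted as Business.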


## Pushing Accuracy Further
Finally, we do a longer run of training to refine the classifier further:

In [25]:
learn.fit_one_cycle(12, slice(1e-3/(2.6**4),1e-3))

epoch,train_loss,valid_loss,accuracy,perplexity,time
0,0.519692,0.277178,0.916292,1.319401,03:42
1,0.410583,0.252542,0.920333,1.287294,03:42
2,0.382233,0.247313,0.924,1.28058,03:41
3,0.346298,0.247176,0.92475,1.280404,03:42
4,0.317382,0.251329,0.926708,1.285733,03:42
5,0.293783,0.269677,0.92525,1.309541,03:42
6,0.278939,0.263954,0.929875,1.302068,03:42
7,0.26641,0.263146,0.931,1.301017,03:42
8,0.259386,0.274693,0.932625,1.316127,03:41
9,0.272189,0.277544,0.935833,1.319884,03:42


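This is typically where you might pick up a few extra percentage points of accuracy. Here, validation accuracy rises from about 92.1% at the end of gradual unfreezing to about 93.6% by the last epoch shown, even though validation loss stops improving after the first few epochs.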

To push toward state-of-the-art results, we could go on to use a forward-backward ensemble. This involves training a second model on reversed text and making predictions from an ensemble of the forward-trained and backward-trained models. For now we'll leave that to a future blog post.
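A minimal sketch of the ensembling step, assuming each model outputs one probability per class. The probability values are invented for illustration, and `ensemble_predict` is a hypothetical helper, not a fastai function.

```python
def ensemble_predict(probs_forward, probs_backward):
    """Average two probability vectors and return (predicted index, averaged probs)."""
    averaged = [(f + b) / 2 for f, b in zip(probs_forward, probs_backward)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return best, averaged

# Invented class probabilities for one article: World, Sports, Business, Sci/Tech
forward = [0.10, 0.05, 0.45, 0.40]
backward = [0.05, 0.05, 0.30, 0.60]

best, averaged = ensemble_predict(forward, backward)
print(best, [round(p, 3) for p in averaged])
```

In this made-up case the forward model alone would pick Business, but the averaged prediction picks Sci/Tech: the two models can correct each other when one is only weakly confident.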

# Conclusion
By the end of this process, we have:

* Fine-Tuned LM: A Wikipedia-trained language model adapted to the style of AG News articles.
* Classifier: A high-accuracy predictor that classifies new articles into four categories (World, Sports, Business, Sci/Tech).

This approach demonstrates the power of ULMFiT: by starting with a large generic language model, then carefully fine-tuning and unfreezing layers in stages, we leverage knowledge learned from Wikipedia, quickly adapt to AG News, and produce a high-quality text classifier with relatively little data.

In short, it’s another testament to the idea that modern deep learning thrives on cleverly transferring prior knowledge. That’s how we build language-savvy systems with fewer resources and in less time—exactly the kind of practical magic that industry researchers love to see in action!

Go forth and classify the news (and beyond).