## News Personalization: Leveraging RAG for Targeted Content Delivery

## Setup

In [2]:
!pip install --upgrade pip


In [4]:
!pip install -qU \
    kaggle \
    sagemaker \
    pinecone-client==2.2.1 \
    ipywidgets==7.0.0\
    seaborn\
    sentence-transformers\
    torch==1.13.1 \
    transformers==4.27.2


## 1. Extracting the archived news dataset from Kaggle

In [4]:
#!pip install --q kaggle 


[Note: According to the Kaggle API documentation, the credentials file is expected at ~/.kaggle/kaggle.json.]

In [5]:
!mkdir -p ~/.kaggle
!touch ~/.kaggle/kaggle.json


In [6]:
api_token = {"username": "abc", "key": "1234"}  # placeholders; use your own Kaggle username and API key

In [7]:
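import json
import os

# Write the credentials to the location the Kaggle CLI reads (~/.kaggle/kaggle.json)
with open(os.path.expanduser('~/.kaggle/kaggle.json'), 'w') as file:
    json.dump(api_token, file)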

!chmod 600 ~/.kaggle/kaggle.json
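
The credential steps above (create the directory, write the JSON, restrict permissions) can also be done in plain Python. The following is a minimal sketch, not the notebook's own method: the helper name `write_kaggle_credentials` is hypothetical, the demo writes to a temporary directory rather than the real `~/.kaggle`, and reading the values from `KAGGLE_USERNAME` and `KAGGLE_KEY` is an assumption about how you store your secrets.

```python
import json
import os
import stat
import tempfile
from pathlib import Path


def write_kaggle_credentials(config_dir: Path, username: str, key: str) -> Path:
    """Write kaggle.json into config_dir and restrict it to the owner (hypothetical helper)."""
    config_dir.mkdir(parents=True, exist_ok=True)  # no error if the directory already exists
    cred_path = config_dir / "kaggle.json"
    cred_path.write_text(json.dumps({"username": username, "key": key}))
    os.chmod(cred_path, stat.S_IRUSR | stat.S_IWUSR)  # same effect as `chmod 600`
    return cred_path


# Demo against a throwaway directory; a real run would target Path.home() / ".kaggle".
demo_dir = Path(tempfile.mkdtemp()) / ".kaggle"
demo_path = write_kaggle_credentials(
    demo_dir,
    username=os.environ.get("KAGGLE_USERNAME", "abc"),  # assumed variable name, placeholder fallback
    key=os.environ.get("KAGGLE_KEY", "1234"),           # assumed variable name, placeholder fallback
)
print(demo_path.name)
```

Keeping the real key in an environment variable avoids committing it alongside the notebook.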

In [10]:
!kaggle datasets download -d rmisra/news-category-dataset --unzip

Downloading news-category-dataset.zip to /root/RAG
 94%|███████████████████████████████████▊  | 25.0M/26.5M [00:01<00:00, 26.6MB/s]
100%|██████████████████████████████████████| 26.5M/26.5M [00:01<00:00, 22.6MB/s]


In [11]:
os.getcwd()+"/Data/News_Category_Dataset_v3.json"

'/root/RAG/Data/News_Category_Dataset_v3.json'

### READ IN THE DATASET

In [52]:
import os
import pandas as pd

df = pd.read_json(os.getcwd()+"/Data/News_Category_Dataset_v3.json",
                 lines=True)
print(df.shape)
df

(209527, 6)


| | link | headline | category | short_description | authors | date |
|---|---|---|---|---|---|---|
| 0 | https://www.huffpost.com/entry/covid-boosters-... | Over 4 Million Americans Roll Up Sleeves For O... | U.S. NEWS | Health experts said it is too early to predict... | Carla K. Johnson, AP | 2022-09-23 |
| 1 | https://www.huffpost.com/entry/american-airlin... | American Airlines Flyer Charged, Banned For Li... | U.S. NEWS | He was subdued by passengers and crew when he ... | Mary Papenfuss | 2022-09-23 |
| 2 | https://www.huffpost.com/entry/funniest-tweets... | 23 Of The Funniest Tweets About Cats And Dogs ... | COMEDY | "Until you have a dog you don't understand wha... | Elyse Wanshel | 2022-09-23 |
| 3 | https://www.huffpost.com/entry/funniest-parent... | The Funniest Tweets From Parents This Week (Se... | PARENTING | "Accidentally put grown-up toothpaste on my to... | Caroline Bologna | 2022-09-23 |
| 4 | https://www.huffpost.com/entry/amy-cooper-lose... | Woman Who Called Cops On Black Bird-Watcher Lo... | U.S. NEWS | Amy Cooper accused investment firm Franklin Te... | Nina Golgowski | 2022-09-22 |
| ... | ... | ... | ... | ... | ... | ... |
| 209522 | https://www.huffingtonpost.com/entry/rim-ceo-t... | RIM CEO Thorsten Heins' 'Significant' Plans Fo... | TECH | Verizon Wireless and AT&T are already promotin... | Reuters, Reuters | 2012-01-28 |
| 209523 | https://www.huffingtonpost.com/entry/maria-sha... | Maria Sharapova Stunned By Victoria Azarenka I... | SPORTS | Afterward, Azarenka, more effusive with the pr... | | 2012-01-28 |
| 209524 | https://www.huffingtonpost.com/entry/super-bow... | Giants Over Patriots, Jets Over Colts Among M... | SPORTS | Leading up to Super Bowl XLVI, the most talked... | | 2012-01-28 |
| 209525 | https://www.huffingtonpost.com/entry/aldon-smi... | Aldon Smith Arrested: 49ers Linebacker Busted ... | SPORTS | CORRECTION: An earlier version of this story i... | | 2012-01-28 |


In [55]:
# Top 5 categories by article count, with the date range each one covers
(df.groupby('category')
   .agg(_num_articles=('headline', 'count'),
        _min_dt=('date', 'min'),
        _max_dt=('date', 'max'))
   .reset_index()
   .sort_values('_num_articles', ascending=False)[:5])

| | category | _num_articles | _min_dt | _max_dt |
|---|---|---|---|---|
| 24 | POLITICS | 35602 | 2014-04-18 | 2022-09-19 |
| 38 | WELLNESS | 17945 | 2012-01-28 | 2022-08-30 |
| 10 | ENTERTAINMENT | 17362 | 2012-01-28 | 2022-09-20 |
| 34 | TRAVEL | 9900 | 2012-01-28 | 2022-04-07 |
| 30 | STYLE & BEAUTY | 9814 | 2012-01-28 | 2022-08-08 |


## 2. Encoding and upserting the data into a vector database (Pinecone)

### Setup - Sentence encoder

In [6]:
!pip install sentence-transformers


In [7]:
from sentence_transformers import SentenceTransformer



In [8]:
model_name='sentence-transformers/all-MiniLM-L6-v2'
encoder = SentenceTransformer(model_name_or_path=model_name)

### checking sentence encoding
sentences = ["This is an example sentence", "Each sentence is converted"]

embeddings = encoder.encode(sentences)
print(f"Number of sentences embedded = {len(embeddings)}")
print(f"Length of each embedding = {len(embeddings[0])}")
print(f"First 10 elements of the embedding =\n {embeddings[0][:10]}")

Number of sentences embedded = 2
Length of each embedding = 384
First 10 elements of the embedding =
 [ 0.0676569   0.06349594  0.04871309  0.07930495  0.03744809  0.00265276
  0.03937498 -0.00709845  0.05936136  0.03153702]
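
The index created later in this notebook uses `metric='cosine'`, so it helps to see what that score measures on vectors like the ones above. This is a minimal pure-Python sketch on made-up 3-dimensional toy vectors (the real embeddings have 384 dimensions); the function name `cosine_similarity` and the toy values are illustrative, not part of the notebook.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: toy_b points the same way as toy_a, toy_c is orthogonal to it.
toy_a = [1.0, 0.0, 1.0]
toy_b = [2.0, 0.0, 2.0]
toy_c = [0.0, 1.0, 0.0]

sim_same = cosine_similarity(toy_a, toy_b)   # close to 1: same direction
sim_orth = cosine_similarity(toy_a, toy_c)   # 0: nothing in common
print(sim_same, sim_orth)
```

Because only the direction matters, scaling a vector (as with `toy_b`) does not change the score.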


In [59]:
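# .loc slicing is label-inclusive, so this keeps 1001 rows; .copy() avoids SettingWithCopyWarning when columns are added
df_subset_test = df.loc[0:1000].copy()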

In [65]:
def generate_item_sentence(item: "pd.Series", 
                           text_columns:"list of columns") -> str:
    """
    Concatenate the text columns of interest into a single string.

    Embeddings are computed separately, by passing the result to the encoder.

    Args:
        item (pd.Series): row of a pandas DataFrame
        text_columns (list): names of the columns to concatenate
    Returns:
        str: the column values joined by a single space
    """
    return ' '.join([item[column] for column in text_columns])

In [66]:
df_subset_test["sentence"] = df_subset_test.apply(lambda row: generate_item_sentence(row,["headline","short_description"]), 
                                                  axis=1)

df_subset_test["sentence_embedding"] = df_subset_test["sentence"].apply(encoder.encode)

In [67]:
generate_item_sentence.__annotations__
help(generate_item_sentence)

Help on function generate_item_sentence in module __main__:

generate_item_sentence(item: 'pd.Series', text_columns: 'list of columns') -> str
    Concatenate the text columns of interest into a single string.
    
    Embeddings are computed separately, by passing the result to the encoder.
    
    Args:
        item (pd.Series): row of a pandas DataFrame
        text_columns (list): names of the columns to concatenate
    Returns:
        str: the column values joined by a single space
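
`generate_item_sentence` assumes every selected column holds a string; a `None` or NaN value would make `' '.join` raise a `TypeError`. The sketch below shows one way to tolerate missing fields. It is an illustration under assumptions: `join_text_fields` is a hypothetical name, and it is demonstrated on a plain dict standing in for a DataFrame row.

```python
from typing import Any, Mapping, Sequence


def join_text_fields(item: Mapping[str, Any], text_columns: Sequence[str]) -> str:
    """Join the non-empty text fields of a row with single spaces (hypothetical helper)."""
    parts = []
    for column in text_columns:
        value = item.get(column)
        # Skip None and NaN (NaN is the only value that is not equal to itself).
        if value is None or value != value:
            continue
        text = str(value).strip()
        if text:  # skip empty strings as well
            parts.append(text)
    return " ".join(parts)


# Toy row with a missing description.
row = {"headline": "Flights Cancelled", "short_description": None}
sentence = join_text_fields(row, ["headline", "short_description"])
print(sentence)
```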



### Push vectors to Pinecone database

#### PINECONE SETUP

In [9]:
import pinecone
import os

PINECONE_API_KEY = "234324dsfdsfs"  # placeholder; substitute your own key and keep real keys out of the notebook
os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY
os.environ["PINECONE_API_ENV"] = "gcp-starter"

pinecone.init(
    api_key = os.environ.get('PINECONE_API_KEY'),
    environment = os.environ.get('PINECONE_API_ENV')
)

#listing all the indexes
pinecone.list_indexes()

['news-articles-rag-aws']

In [57]:
import time

index_name = 'news-articles-rag-aws'

if index_name in pinecone.list_indexes():
    pinecone.delete_index(index_name)

#Index deleted 
print(pinecone.list_indexes())
    
pinecone.create_index(
    name=index_name,
    dimension=encoder.get_sentence_embedding_dimension(),
    metric='cosine'
)
# wait for index to finish initialization
while not pinecone.describe_index(index_name).status['ready']:
    time.sleep(1)

[]
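
The `while not ... time.sleep(1)` loop above waits indefinitely if the index never becomes ready. A bounded wait is one alternative. This is a generic sketch: `wait_until` is a hypothetical helper, and the readiness check in the demo is a stand-in function rather than a real Pinecone call.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool], timeout_s: float = 60.0, poll_s: float = 1.0) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses; return the final state."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(poll_s)
    return is_ready()  # one last check at the deadline


# Stand-in readiness check that succeeds on the third call.
calls = {"n": 0}


def fake_ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


became_ready = wait_until(fake_ready, timeout_s=2.0, poll_s=0.01)
print(became_ready, calls["n"])
```

In the notebook, the check passed in would be `lambda: pinecone.describe_index(index_name).status['ready']`, the same expression the loop above already uses.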


In [29]:
#Checking the indexes creation:
pinecone.list_indexes()

['news-articles-rag-aws']

### Upsert data into Pinecone

In [59]:
from tqdm.auto import tqdm

batch_size = 2  # can be increased, but larger batches need more instance memory
vector_limit = df_subset_test.shape[0]  # upsert every row of the test subset

answers = df_subset_test[:vector_limit]
index = pinecone.Index(index_name)

for i in tqdm(range(0, len(answers), batch_size)):
    # find end of batch
    i_end = min(i + batch_size, len(answers))
    # create IDs batch (row positions as strings)
    ids = [str(x) for x in range(i, i_end)]
    if i % 100 == 0:
        print(f"i = {i}, i_end = {i_end}, ids = {ids}")
    # texts for this batch; stored as metadata so they can be returned at query time
    texts = answers["sentence"][i:i_end].tolist()
    metadatas = [{'text': text} for text in texts]
    # create embeddings, one vector per text
    embeddings = [encoder.encode(sent).tolist() for sent in texts]
    # create records list for upsert: (id, vector, metadata) tuples
    records = list(zip(ids, embeddings, metadatas))
    # upsert to Pinecone
    index.upsert(vectors=records)

  0%|          | 1/501 [00:00<01:39,  5.01it/s]

i = 0, i_end = 2, ids = ['0', '1']


 10%|█         | 51/501 [00:07<01:01,  7.30it/s]

i = 100, i_end = 102, ids = ['100', '101']


 20%|██        | 101/501 [00:14<00:54,  7.29it/s]

i = 200, i_end = 202, ids = ['200', '201']


 30%|███       | 151/501 [00:20<00:49,  7.14it/s]

i = 300, i_end = 302, ids = ['300', '301']


 40%|████      | 201/501 [00:28<00:43,  6.82it/s]

i = 400, i_end = 402, ids = ['400', '401']


 50%|█████     | 251/501 [00:35<00:33,  7.44it/s]

i = 500, i_end = 502, ids = ['500', '501']


 60%|██████    | 301/501 [00:42<00:28,  7.14it/s]

i = 600, i_end = 602, ids = ['600', '601']


 70%|███████   | 351/501 [00:49<00:20,  7.32it/s]

i = 700, i_end = 702, ids = ['700', '701']


 80%|████████  | 401/501 [00:55<00:14,  7.04it/s]

i = 800, i_end = 802, ids = ['800', '801']


 90%|█████████ | 451/501 [01:02<00:07,  7.00it/s]

i = 900, i_end = 902, ids = ['900', '901']


100%|██████████| 501/501 [01:09<00:00,  7.19it/s]

i = 1000, i_end = 1001, ids = ['1000']
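
The progress log above (501 iterations, with a final batch holding only ID `'1000'`) follows directly from the batching arithmetic. A small sketch of that arithmetic, with the hypothetical helper `batch_ranges` standing in for the loop bounds:

```python
from typing import Iterator, Tuple


def batch_ranges(n_items: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) positions covering n_items in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_items, batch_size):
        yield start, min(start + batch_size, n_items)


# Same numbers as the upsert run: 1001 rows, batches of 2.
ranges = list(batch_ranges(1001, 2))
last_ids = [str(x) for x in range(*ranges[-1])]
print(len(ranges))             # 501
print(ranges[0], ranges[-1])   # (0, 2) (1000, 1001)
print(last_ids)                # ['1000']
```

A larger `batch_size` reduces the number of round trips to the index at the cost of holding more embeddings in memory at once.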





In [68]:
# check number of records in the index
index = pinecone.Index(index_name)
index.describe_index_stats()

{'dimension': 384,
 'index_fullness': 0.01001,
 'namespaces': {'': {'vector_count': 1001}},
 'total_vector_count': 1001}

## 3. Leveraging RAG for Targeted Content Delivery

### LLM Setup

In [10]:
from transformers import AutoModelForSeq2SeqLM
from transformers import AutoTokenizer
from transformers import GenerationConfig
import torch
import pandas as pd

model_name='google/flan-t5-base'

model_flan = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer_flan = AutoTokenizer.from_pretrained(model_name, use_fast=True)

### RAG Setup

In [11]:
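from sentence_transformers import SentenceTransformer

# Load the same sentence encoder that was used to build the index,
# so that query vectors and article vectors live in the same space

model_name='sentence-transformers/all-MiniLM-L6-v2'
encoder = SentenceTransformer(model_name_or_path=model_name)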

In [12]:
import pinecone
import os

PINECONE_API_KEY = "2343242adasda"  # placeholder; substitute your own key and keep real keys out of the notebook
os.environ["PINECONE_API_KEY"] = PINECONE_API_KEY
os.environ["PINECONE_API_ENV"] = "gcp-starter"

pinecone.init(
    api_key = os.environ.get('PINECONE_API_KEY'),
    environment = os.environ.get('PINECONE_API_ENV')
)

#listing all the indexes
pinecone.list_indexes()

['news-articles-rag-aws']

In [13]:
# check number of records in the index
index_name='news-articles-rag-aws'
index = pinecone.Index(index_name)
index.describe_index_stats()

{'dimension': 384,
 'index_fullness': 0.01001,
 'namespaces': {'': {'vector_count': 1001}},
 'total_vector_count': 1001}

### Helper Functions

In [28]:
def retriver(query: str, top_k: int) -> list[str]:
    """
    Retrieve the most relevant articles from the Pinecone vector database.

    Args:
        query (str): user query
        top_k (int): number of matches to return
    Returns:
        list[str]: text of the top_k most similar articles
    """
    # Embed the query with the same encoder used for the indexed articles
    query_vector = encoder.encode(query).tolist()

    res = index.query(query_vector, top_k=top_k, include_metadata=True)

    # Each match carries the article text in its metadata
    contexts = [match.metadata['text'] for match in res.matches]
    return contexts

In [29]:
retriver.__annotations__
help(retriver)

Help on function retriver in module __main__:

retriver(query: str, top_k: int) -> list[str]
    Retrieve the most relevant articles from the Pinecone vector database.
    
    Args:
        query (str): user query
        top_k (int): number of matches to return
    Returns:
        list[str]: text of the top_k most similar articles
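
A top-k query always returns k matches, even when some of them are only weakly related to the question. One possible refinement is to drop matches below a similarity threshold before building the context. The sketch below is illustrative only: `filter_by_score` and `min_score` are hypothetical names, the matches are plain dicts, and the scores are invented toy values rather than results from the index.

```python
from typing import Any, Dict, List


def filter_by_score(matches: List[Dict[str, Any]], min_score: float) -> List[str]:
    """Keep the text of matches whose similarity score is at least min_score."""
    return [m["metadata"]["text"] for m in matches if m["score"] >= min_score]


# Invented toy matches, ordered by score as a vector search would return them.
toy_matches = [
    {"score": 0.61, "metadata": {"text": "Airline cancels dozens of flights"}},
    {"score": 0.48, "metadata": {"text": "Court ruling ends mask mandate for travel"}},
    {"score": 0.19, "metadata": {"text": "Streaming service plans cuts"}},
]

kept = filter_by_score(toy_matches, min_score=0.3)
print(kept)
```

A suitable threshold depends on the encoder and the data, so it would need to be chosen by inspecting real scores.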



In [30]:
from typing import List


def construct_context(contexts: List[str], max_section_len: int, separator: str) -> str:
    """
    Build the context string from the retrieved passages.

    Passages are added in order until the character budget would be exceeded.

    Args:
        contexts (List[str]): passages returned by the semantic search
        max_section_len (int): maximum length of the context, in characters
        separator (str): separator placed between passages (for example a space or a newline)
    Returns:
        str: the selected passages joined by the separator
    """
    chosen_sections = []
    chosen_sections_len = 0

    for text in contexts:
        text = text.strip()
        # Add contexts until we run out of space; the +2 is a fixed allowance for the separator.
        chosen_sections_len += len(text) + 2
        if chosen_sections_len > max_section_len:
            break
        chosen_sections.append(text)
    concatenated_doc = separator.join(chosen_sections)
    return concatenated_doc

In [31]:
construct_context.__annotations__
help(construct_context)

Help on function construct_context in module __main__:

construct_context(contexts: List[str], max_section_len: int, separator: str) -> str
    Build the context string from the retrieved passages.
    
    Passages are added in order until the character budget would be exceeded.
    
    Args:
        contexts (List[str]): passages returned by the semantic search
        max_section_len (int): maximum length of the context, in characters
        separator (str): separator placed between passages (for example a space or a newline)
    Returns:
        str: the selected passages joined by the separator
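
To see how the character budget in `construct_context` behaves, here is a small worked example. It is a sketch with one deliberate difference: the hypothetical `pack_sections` charges `len(separator)` per passage, whereas the notebook function adds a fixed 2. The passages are toy strings.

```python
from typing import List


def pack_sections(sections: List[str], max_len: int, separator: str = "\n") -> str:
    """Greedily keep passages, in order, while the running character count fits max_len."""
    chosen: List[str] = []
    used = 0
    for text in sections:
        text = text.strip()
        used += len(text) + len(separator)  # each passage costs its length plus one separator
        if used > max_len:
            break  # stop at the first passage that does not fit
        chosen.append(text)
    return separator.join(chosen)


docs = ["aaaa", "bbbb", "cccc"]  # each passage costs 4 + 1 = 5 characters

two_fit = pack_sections(docs, max_len=10)    # budget covers exactly two passages
all_fit = pack_sections(docs, max_len=15)    # budget covers all three
none_fit = pack_sections(docs, max_len=4)    # even the first passage is too long
print(repr(two_fit), repr(all_fit), repr(none_fit))
```

Note that the budget is in characters, while the tokenizer's `max_length` is in tokens, so a context that fits here can still be truncated during tokenization.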



In [38]:
def construct_payload(prompt_template: str,
                      question: str,
                      context_str: str,
                      padding: str = "longest") -> str:
    """
    Construct the prompt for the LLM by filling in the template.

    Args:
        prompt_template (str): template containing {context} and {question} placeholders
        question (str): question for the LLM
        context_str (str): context passed to the LLM
        padding (str): not used when building the prompt; kept so existing calls still work
    Returns:
        str: LLM prompt
    """
    prompt = prompt_template.replace("{context}", context_str).replace("{question}", question)

    return prompt


In [39]:
construct_payload.__annotations__
help(construct_payload)

Help on function construct_payload in module __main__:

construct_payload(prompt_template: str, question: str, context_str: str, padding: str = 'longest') -> str
    Construct the prompt for the LLM by filling in the template.
    
    Args:
        prompt_template (str): template containing {context} and {question} placeholders
        question (str): question for the LLM
        context_str (str): context passed to the LLM
        padding (str): not used when building the prompt; kept so existing calls still work
    Returns:
        str: LLM prompt



In [40]:
prompt_template = """Answer the following QUESTION without hallucination.

CONTEXT:
{context}

QUESTION:
{question}

ANSWER:
"""
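
`construct_payload` fills this template with two chained `str.replace` calls. One subtlety of chaining is that if the retrieved context itself contains the literal text `{question}`, the second call rewrites it too. The sketch below shows a single-pass alternative; `fill_template` is a hypothetical helper and the template and strings are shortened toy values.

```python
import re
from typing import Dict


def fill_template(template: str, values: Dict[str, str]) -> str:
    """Replace each {name} placeholder in one pass, leaving inserted text untouched."""
    pattern = re.compile("|".join(re.escape("{" + name + "}") for name in values))
    # m.group(0) is e.g. "{context}"; strip the braces to look up the value.
    return pattern.sub(lambda m: values[m.group(0)[1:-1]], template)


toy_template = "CONTEXT:\n{context}\n\nQUESTION:\n{question}\n\nANSWER:\n"
toy_context = "Text mentioning {question} literally."
toy_question = "What happened?"

single_pass = fill_template(toy_template, {"context": toy_context, "question": toy_question})
chained = toy_template.replace("{context}", toy_context).replace("{question}", toy_question)

print(single_pass)
print(chained)
```

With news text this collision is unlikely, so the chained version is usually fine; the single-pass form simply removes the edge case.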

### LLM Base response without RAG

In [55]:
max_source_length = 512
max_target_length = max_source_length // 2  # integer token budget for the generated answer
padding = "longest"  # alternative: "max_length"

# Other prompts tried: summarization, entity extraction, sentiment
question = "What is the news about air travel?"

# No retrieved context: this is the baseline without RAG
context_str = ""

prompt = construct_payload(prompt_template, question, context_str, padding="longest")

inputs = tokenizer_flan(prompt,
    max_length=max_source_length,
    return_tensors='pt',
    padding=padding,
    truncation=True)

output_ids = model_flan.generate(inputs["input_ids"], max_new_tokens=max_target_length)[0]
# skip_special_tokens drops <pad> and </s> from the decoded text
base_output = tokenizer_flan.decode(output_ids, skip_special_tokens=True)

print(f'LLM RESPONSE WITHOUT RAG CONTEXT:\n{base_output}')

LLM RESPONSE WITHOUT RAG CONTEXT:
Air travel is a form of transportation that is often referred to as "air travel" or "air travel" in the United States.


### LLM response with RAG - FLAN-T5-Base

In [102]:
max_source_length = 512
max_target_length = 1000
padding = "longest"  # alternative: "max_length"

# Other prompts tried: summarization, entity extraction, sentiment
question = "What is the news about air travel?"

# Retrieve the most similar articles
contexts = retriver(query=question, top_k=5)
print(f"{contexts=}")

# Construct the context
context_str = construct_context(contexts=contexts, max_section_len=2000, separator='\n')
print(f"\n{context_str=}")

# Create the prompt
prompt = construct_payload(prompt_template, question, context_str, padding="longest")
print(f"\n{prompt=}")

inputs = tokenizer_flan(prompt,
    max_length=max_source_length,
    return_tensors='pt',
    padding=padding,
    truncation=True)

output_ids = model_flan.generate(inputs["input_ids"], max_new_tokens=max_target_length)[0]
# skip_special_tokens drops <pad> and </s> from the decoded text
rag_output = tokenizer_flan.decode(output_ids, skip_special_tokens=True)

print(f'\n\nLLM RESPONSE WITH RAG CONTEXT:\n{rag_output}')


contexts=["Jen Psaki Says Court Ruling Ending Mask Mandate For Travel Is 'Disappointing' The White House press secretary said the CDC continues to recommend mask-wearing on airplanes, even as carriers dropped mask requirements.", 'Alaska Airlines Cancels Dozens Of Flights As Pilots Picket More than 100 Alaska Airlines flights were canceled by the airline, including 66 in Seattle, 20 in Portland, Oregon, 10 in Los Angeles and seven in San Francisco.', "American Airlines Flyer Charged, Banned For Life After Punching Flight Attendant On Video He was subdued by passengers and crew when he fled to the back of the aircraft after the confrontation, according to the U.S. attorney's office in Los Angeles.", '78,000 Pounds Of Infant Formula Arrives In US From Europe The Air Force flew pallets of baby formula to Indiana to begin alleviating the devastating nationwide shortage.', 'What’s Going On With HBO Max? Here’s What We Know So Far. Speculation about potentially drastic cuts at HBO Max has le

### LLM response with RAG - FLAN-T5-Small

In [103]:
tokenizer_flan_small = AutoTokenizer.from_pretrained("google/flan-t5-small")
model_flan_small = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

tokenizer_config.json: 100%|██████████| 2.54k/2.54k [00:00<00:00, 8.79MB/s]
spiece.model: 100%|██████████| 792k/792k [00:00<00:00, 37.5MB/s]
tokenizer.json: 100%|██████████| 2.42M/2.42M [00:00<00:00, 29.0MB/s]
special_tokens_map.json: 100%|██████████| 2.20k/2.20k [00:00<00:00, 11.5MB/s]
config.json: 100%|██████████| 1.40k/1.40k [00:00<00:00, 7.24MB/s]
pytorch_model.bin: 100%|██████████| 308M/308M [00:00<00:00, 388MB/s] 
generation_config.json: 100%|██████████| 147/147 [00:00<00:00, 749kB/s]


In [105]:
max_source_length = 512
max_target_length = 1000
padding = "longest"  # alternative: "max_length"

# Other prompts tried: summarization, entity extraction, sentiment
question = "What is the news about air travel?"

# Retrieve the most similar articles
contexts = retriver(query=question, top_k=5)
print(f"{contexts=}")

# Construct the context
context_str = construct_context(contexts=contexts, max_section_len=2000, separator='\n')
print(f"\n{context_str=}")

# Create the prompt
prompt = construct_payload(prompt_template, question, context_str, padding="longest")
print(f"\n{prompt=}")

inputs = tokenizer_flan_small(prompt,
    max_length=max_source_length,
    return_tensors='pt',
    padding=padding,
    truncation=True)

output_ids = model_flan_small.generate(inputs["input_ids"], max_new_tokens=max_target_length)[0]
# skip_special_tokens drops <pad> and </s> from the decoded text
rag_output = tokenizer_flan_small.decode(output_ids, skip_special_tokens=True)

print(f'\n\nLLM RESPONSE WITH RAG CONTEXT:\n{rag_output}')


contexts=["Jen Psaki Says Court Ruling Ending Mask Mandate For Travel Is 'Disappointing' The White House press secretary said the CDC continues to recommend mask-wearing on airplanes, even as carriers dropped mask requirements.", 'Alaska Airlines Cancels Dozens Of Flights As Pilots Picket More than 100 Alaska Airlines flights were canceled by the airline, including 66 in Seattle, 20 in Portland, Oregon, 10 in Los Angeles and seven in San Francisco.', "American Airlines Flyer Charged, Banned For Life After Punching Flight Attendant On Video He was subdued by passengers and crew when he fled to the back of the aircraft after the confrontation, according to the U.S. attorney's office in Los Angeles.", '78,000 Pounds Of Infant Formula Arrives In US From Europe The Air Force flew pallets of baby formula to Indiana to begin alleviating the devastating nationwide shortage.', 'What’s Going On With HBO Max? Here’s What We Know So Far. Speculation about potentially drastic cuts at HBO Max has le

### LLM response with RAG - Falconsai/text_summarization

In [97]:
tokenizer_falcon = AutoTokenizer.from_pretrained("Falconsai/text_summarization")
model_falcon = AutoModelForSeq2SeqLM.from_pretrained("Falconsai/text_summarization")

In [101]:
from transformers import pipeline

# Retrieve the most similar articles (reuses `question` from the previous cell)
contexts = retriver(query=question, top_k=5)
print(f"{contexts=}")

# Construct the context
context_str = construct_context(contexts=contexts,
                                max_section_len=2000, separator='\n')
print(f"\n{context_str=}")

summarizer = pipeline("summarization", model="Falconsai/text_summarization")

# Note: len(context_str) counts characters, not tokens, so this max_length is much
# larger than the tokenized input and the pipeline warns about it.
rag_summary = summarizer(context_str, max_length=len(context_str)+10, min_length=30, do_sample=False)[0]['summary_text']

print(f"\n\nLLM RESPONSE WITH RAG CONTEXT USING Falconsai/text_summarization =\n{rag_summary}")

contexts=["Jen Psaki Says Court Ruling Ending Mask Mandate For Travel Is 'Disappointing' The White House press secretary said the CDC continues to recommend mask-wearing on airplanes, even as carriers dropped mask requirements.", 'Alaska Airlines Cancels Dozens Of Flights As Pilots Picket More than 100 Alaska Airlines flights were canceled by the airline, including 66 in Seattle, 20 in Portland, Oregon, 10 in Los Angeles and seven in San Francisco.', "American Airlines Flyer Charged, Banned For Life After Punching Flight Attendant On Video He was subdued by passengers and crew when he fled to the back of the aircraft after the confrontation, according to the U.S. attorney's office in Los Angeles.", '78,000 Pounds Of Infant Formula Arrives In US From Europe The Air Force flew pallets of baby formula to Indiana to begin alleviating the devastating nationwide shortage.', 'What’s Going On With HBO Max? Here’s What We Know So Far. Speculation about potentially drastic cuts at HBO Max has le

Your max_length is set to 1067, but you input_length is only 246. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=123)




LLM RESPONSE WITH RAG CONTEXT USING Falconsai/text_summarization =
Jen Psaki says Court ruling on Ending Mask Mandate For Travel Is 'Disappointing' The White House says the CDC continues to recommend mask-wearing on airplanes . Alaska Airlines Cancels Dozens Of Flights As Pilots Picket More than 100 Alaska Airlines flights were canceled by the airline .
