# In-Context Learning


In-context learning generalises few-shot learning: the LLM is given a context as part of the prompt and asked to respond using the information in that context.

* Example: *"Summarize this research article into one paragraph highlighting its strengths and weaknesses: [insert article text]"*
* Example: *"Extract all the quotes from this text and organize them in alphabetical order: [insert text]"*
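
Both examples follow the same pattern: an instruction followed by the context the model should draw on. A minimal sketch of assembling such a prompt might look like this; the helper name `build_prompt` and the delimiter lines are illustrative assumptions, not part of any library.

```python
def build_prompt(instruction: str, context: str) -> str:
    """Combine a task instruction with the context the model should use.

    Hypothetical helper: the delimiter lines are just one way of separating
    the instruction from the supplied text.
    """
    return (
        f"{instruction.strip()}\n\n"
        "--- BEGIN CONTEXT ---\n"
        f"{context.strip()}\n"
        "--- END CONTEXT ---"
    )


# Placeholder text standing in for a real article.
article = "Transformers process tokens in parallel using self-attention."
prompt = build_prompt(
    "Summarize this research article into one paragraph "
    "highlighting its strengths and weaknesses:",
    article,
)
print(prompt)
```

The resulting string can be sent to any LLM as a single prompt; only the context changes from one request to the next.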

Retrieval-Augmented Generation (RAG), a very popular technique that you will learn in week 5, is a form of in-context learning where:
* a search engine is used to retrieve some relevant information
* that information is then provided to the LLM as context
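
The two steps above can be sketched with a toy example. This is only an illustration under simplifying assumptions: the "search engine" is a word-overlap ranking over three made-up sentences, and the function names `tokenize`, `retrieve`, and `build_rag_prompt` are hypothetical. Real RAG systems typically use a proper search index or embeddings.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lower-case a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Toy retriever: rank documents by the number of words shared with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(query: str, documents: list[str], top_k: int = 1) -> str:
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up document collection for illustration.
docs = [
    "The Eiffel Tower is located in Paris.",
    "Python is a programming language created by Guido van Rossum.",
    "Gemini is a family of models developed by Google.",
]
rag_prompt = build_rag_prompt("Who created the Python programming language?", docs)
print(rag_prompt)
```

Here the second sentence shares the most words with the question, so it is the passage placed in the prompt.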


In this example we download some recent research papers from arXiv, extract the text from the PDF files, and ask Gemini to summarize each paper and list its main strengths and weaknesses. Finally, we write the results to a local HTML file and display them as Markdown.

In [None]:
!pip install requests bs4 google-generativeai pypdf



In [None]:
import os
import requests
from bs4 import BeautifulSoup
import google.generativeai as genai
from urllib.request import urlopen, urlretrieve
from IPython.display import Markdown, display
from pypdf import PdfReader
from datetime import date
from tqdm import tqdm
from google.colab import userdata

In [None]:
# Read the key from Colab secrets; never hardcode an API key in the notebook.
API_KEY = userdata.get("GEMINI_API_KEY")
# API_KEY = os.environ.get("GEMINI_API_KEY")  # alternative outside Colab
genai.configure(api_key=API_KEY)

We select those papers that have been featured in Hugging Face papers.

In [None]:
BASE_URL = "https://huggingface.co/papers"
page = requests.get(BASE_URL)
soup = BeautifulSoup(page.content, "html.parser")
h3s = soup.find_all("h3")

papers = []

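for h3 in h3s:
    a = h3.find("a")
    if a is None or not a.get("href"):
        continue  # skip headings that do not link to a paper

    title = a.text.strip()
    # the link looks like /papers/<arXiv id>; keep only the /<arXiv id> part
    link = a["href"].replace("/papers", "")

    papers.append({"title": title, "url": f"https://arxiv.org/pdf{link}"})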
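
The link rewriting in the cell above can be checked in isolation. The sketch below assumes that each link on the listing page has the form `/papers/<arXiv id>`, which is what the `replace` call relies on; `to_arxiv_pdf_url` is a hypothetical helper name.

```python
def to_arxiv_pdf_url(href: str) -> str:
    """Turn a listing link such as '/papers/2512.10430' into an arXiv PDF URL."""
    prefix = "/papers/"
    if not href.startswith(prefix):
        # Fail loudly rather than build a URL from an unexpected link.
        raise ValueError(f"Unexpected link format: {href!r}")
    paper_id = href[len(prefix):]
    return f"https://arxiv.org/pdf/{paper_id}"


print(to_arxiv_pdf_url("/papers/2512.10430"))
# prints https://arxiv.org/pdf/2512.10430
```

Raising on an unexpected link makes a change in the page layout visible immediately instead of producing broken download URLs.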

Helper functions to extract text from web pages and PDFs, and to display Markdown in the notebook.

In [None]:
def extract_paper(url):
    html = urlopen(url).read()
    soup = BeautifulSoup(html, features="html.parser")

    # kill all script and style elements
    for script in soup(["script", "style"]):
        script.extract()    # rip it out

    # get text
    text = soup.get_text()

    # break into lines and remove leading and trailing space on each
    lines = (line.strip() for line in text.splitlines())
    # break multi-headlines into a line each
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # drop blank lines
    text = '\n'.join(chunk for chunk in chunks if chunk)

    return text


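def extract_pdf(url):
    # download the PDF to a local file, then read it page by page
    urlretrieve(url, "pdf_file.pdf")
    reader = PdfReader("pdf_file.pdf")
    text = ""
    for page in reader.pages:
        # guard against pages that yield no text
        text += (page.extract_text() or "") + "\n"
    return text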


def printmd(string):
    display(Markdown(string))
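
To see what the whitespace clean-up in `extract_paper` does, the same three steps can be applied to a plain string. This is a small sketch; `clean_text` is a hypothetical name and the sample string is made up.

```python
def clean_text(text: str) -> str:
    """Strip each line, split on double spaces, and drop empty pieces."""
    # Remove leading and trailing whitespace on every line.
    lines = (line.strip() for line in text.splitlines())
    # Split lines that contain several phrases separated by two spaces.
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # Keep only non-empty pieces, one per line.
    return "\n".join(chunk for chunk in chunks if chunk)


raw = "  Title  Subtitle \n\n   Body text.   \n"
print(clean_text(raw))
```

The blank line disappears and `Title` and `Subtitle` end up on separate lines, followed by `Body text.`.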

In [None]:
LLM = "gemini-2.5-flash"
model = genai.GenerativeModel(LLM)

We use Gemini to summarize the papers.

In [None]:
import time

for paper in tqdm(papers):
    time.sleep(10)  # pause between requests
    try:
        prompt = (
            "Analyze this research article. "
            "Output the result solely as a Markdown table with two columns: 'Strengths' and 'Weaknesses'. "
            "List the key strengths and weaknesses in the respective columns. "
            + extract_pdf(paper["url"])
        )
        paper["summary"] = model.generate_content(prompt).text
    except Exception as exc:
        # report the reason instead of hiding it; Ctrl-C still interrupts the loop
        print(f"Generation failed: {exc}")
        paper["summary"] = "Paper not available"

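
When a request fails, one option is to retry it a few times with growing pauses instead of giving up on the paper straight away. The sketch below is not part of the notebook: `with_retries` is a hypothetical helper, `flaky_call` is a stand-in for a real API call, and the delays are arbitrary example values.

```python
import time
from typing import Callable


def with_retries(
    call: Callable[[], str],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run `call`, waiting longer after each failure, and re-raise the last error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == max_attempts:
                raise
            # Double the wait after every failed attempt.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")


# Stand-in for a flaky API call: fails twice, then succeeds.
attempts = {"count": 0}


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "summary text"


# Record the requested delays instead of actually sleeping.
waits = []
result = with_retries(flaky_call, sleep=waits.append)
print(result, waits)
```

In the loop above, `call` could be something like `lambda: model.generate_content(prompt).text`, with the default `time.sleep` left in place.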

We write the results to an HTML file.

In [None]:
!pip install markdown



In [None]:
import markdown

# page header: the styles go in <head>, the visible title and date go in <body>
page = (
    "<html><head><style> table {border-collapse: collapse; width: 100%;} "
    "th, td {text-align: left; padding: 8px; border: 1px solid #ddd;} "
    "th {background-color: #f2f2f2;} </style></head><body>"
    f"<h1>Daily Dose of AI Research</h1> <h4>{date.today()}</h4> "
    f"<p><i>Summaries generated with: {LLM}</i></p>"
)

with open("papers.html", "w") as f:
    f.write(page)

for paper in papers:
    if paper["summary"] != "Paper not available":
        summary_html = markdown.markdown(paper["summary"], extensions=['tables'])
    else:
        summary_html = "<p>Paper not available</p>"

    page = f'<h2><a href="{paper["url"]}">{paper["title"]}</a></h2> {summary_html}'
    with open("papers.html", "a") as f:
        f.write(page)

end = "</body></html>"
with open("papers.html", "a") as f:
    f.write(end)
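
Paper titles are inserted into the HTML as they are, so a title containing characters such as `&` or `<` could break the markup. A possible safeguard, sketched here with the standard library's `html.escape`, is to escape the title and URL first; `paper_heading` is a hypothetical helper, and the title and URL in the example are made-up placeholders.

```python
from html import escape


def paper_heading(title: str, url: str) -> str:
    """Build the linked <h2> heading for one paper, escaping special characters."""
    return f'<h2><a href="{escape(url, quote=True)}">{escape(title)}</a></h2>'


# Placeholder title and URL, chosen to contain characters that need escaping.
print(paper_heading("Q&A with <LLMs>", "https://arxiv.org/pdf/0000.00000"))
```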

We can also display the results in this notebook as Markdown.

In [None]:
for paper in papers:
    printmd("**[{}]({})**<br>{}<br><br>".format(paper["title"],
                                                paper["url"],
                                                paper["summary"]))

**[T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground](https://arxiv.org/pdf/2512.10430)**<br>Paper not available<br><br>

**[Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving](https://arxiv.org/pdf/2512.10739)**<br>Paper not available<br><br>

**[Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation](https://arxiv.org/pdf/2512.10949)**<br>Paper not available<br><br>

**[OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification](https://arxiv.org/pdf/2512.10756)**<br>Paper not available<br><br>

**[Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning](https://arxiv.org/pdf/2512.10534)**<br>Paper not available<br><br>

**[MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos](https://arxiv.org/pdf/2512.10881)**<br>Paper not available<br><br>

**[BEAVER: An Efficient Deterministic LLM Verifier](https://arxiv.org/pdf/2512.05439)**<br>Paper not available<br><br>

**[From Macro to Micro: Benchmarking Microscopic Spatial Intelligence on Molecules via Vision-Language Models](https://arxiv.org/pdf/2512.10867)**<br>Paper not available<br><br>

**[Thinking with Images via Self-Calling Agent](https://arxiv.org/pdf/2512.08511)**<br>Paper not available<br><br>

**[VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction](https://arxiv.org/pdf/2511.23386)**<br>Paper not available<br><br>

**[Stronger Normalization-Free Transformers](https://arxiv.org/pdf/2512.10938)**<br>Paper not available<br><br>

**[StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space](https://arxiv.org/pdf/2512.10959)**<br>Paper not available<br><br>

**[Evaluating Gemini Robotics Policies in a Veo World Simulator](https://arxiv.org/pdf/2512.10675)**<br>Paper not available<br><br>

**[MoRel: Long-Range Flicker-Free 4D Motion Modeling via Anchor Relay-based Bidirectional Blending with Hierarchical Densification](https://arxiv.org/pdf/2512.09270)**<br>Paper not available<br><br>

**[The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality](https://arxiv.org/pdf/2512.10791)**<br>Paper not available<br><br>

**[Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task](https://arxiv.org/pdf/2512.10359)**<br>Paper not available<br><br>

**[ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning](https://arxiv.org/pdf/2512.09924)**<br>Paper not available<br><br>

**[H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos](https://arxiv.org/pdf/2512.09406)**<br>Paper not available<br><br>

**[Fed-SE: Federated Self-Evolution for Privacy-Constrained Multi-Environment LLM Agents](https://arxiv.org/pdf/2512.08870)**<br>Paper not available<br><br>

**[Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization](https://arxiv.org/pdf/2512.10955)**<br>Paper not available<br><br>

**[DuetSVG: Unified Multimodal SVG Generation with Internal Visual Guidance](https://arxiv.org/pdf/2512.10894)**<br>Paper not available<br><br>

**[Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale](https://arxiv.org/pdf/2512.10398)**<br>Paper not available<br><br>

**[X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale](https://arxiv.org/pdf/2512.04537)**<br>Paper not available<br><br>

**[MOA: Multi-Objective Alignment for Role-Playing Agents](https://arxiv.org/pdf/2512.09756)**<br>Paper not available<br><br>

**[DragMesh: Interactive 3D Generation Made Easy](https://arxiv.org/pdf/2512.06424)**<br>Paper not available<br><br>