In [1]:
import pandas as pd

# Display the complete contents of dataframe cells.
pd.set_option("display.max_colwidth", None)

In [2]:
import os
from getpass import getpass

import openai

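# Read the API key from the environment, prompting for it only if it is unset.
if not (openai_api_key := os.getenv("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")
openai.api_key = openai_api_key
# Export the key so that libraries reading OPENAI_API_KEY can find it.
os.environ["OPENAI_API_KEY"] = openai_api_key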

In [3]:
import phoenix as px
from llama_index.core import set_global_handler
from phoenix.trace.langchain import LangChainInstrumentor

session = px.launch_app()

# Set up instrumentation for both LlamaIndex and LangChain (used by Ragas).
set_global_handler("arize_phoenix")
LangChainInstrumentor().instrument()

🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


In [4]:
from llama_index.core import SimpleDirectoryReader

dir_path = "./data/prompt-engineering-papers"
reader = SimpleDirectoryReader(dir_path, num_files_limit=2)
documents = reader.load_data()

In [5]:
from phoenix.trace import using_project
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator

TEST_SIZE = 5

# Create a test set generator backed by OpenAI models.
generator = TestsetGenerator.with_openai(
    generator_llm="gpt-3.5-turbo-0125",
    critic_llm="gpt-3.5-turbo-0125",
    embeddings="text-embedding-3-large",
)

# Set the distribution of question types (the weights sum to 1).
distribution = {simple: 0.5, reasoning: 0.25, multi_context: 0.25}

# Generate the test set, tracing the calls under the "ragas-testset" Phoenix project.
with using_project("ragas-testset"):
    testset = generator.generate_with_llamaindex_docs(
        documents, test_size=TEST_SIZE, distributions=distribution
    )
test_df = testset.to_pandas()
test_df.head()

Filename and doc_id are the same for all nodes.


max retries exceeded for MultiContextEvolution(generator_llm=LangchainLLMWrapper(run_config=RunConfig(timeout=60, max_retries=15, max_wait=90, max_workers=16, exception_types=<class 'openai.RateLimitError'>)), docstore=InMemoryDocumentStore(splitter=<langchain_text_splitters.base.TokenTextSplitter object at 0x7f68fcecd240>, nodes=[Node(page_content='arXiv:1605.08386v1  [math.CO]  26 May 2016HEAT-BATH RANDOM WALKS WITH MARKOV BASES\nCAPRICE STANLEY AND TOBIAS WINDISCH\nAbstract. Graphs on lattice points are studied whose edges come from a ﬁ nite set of\nallowed moves of arbitrary length. We show that the diameter of these graphs on ﬁbers of a\nﬁxed integer matrix can be bounded from above by a constant. W e then study the mixing\nbehaviour of heat-bath random walks on these graphs. We also state explicit conditions\non the set of moves so that the heat-bath random walk, a genera lization of the Glauber\ndynamics, is an expander in ﬁxed dimension.\nContents\n1. Introduction 1\n2. Graphs 

Unnamed: 0,question,contexts,ground_truth,evolution_type,metadata,episode_done
0,What are some of the advanced training strategies discussed in the context of In-Context Learning (ICL)?,"[ment aims to improve the scalability and efficiency\nof ICL. As LMs continue to scale up, exploring\nways to effectively and efficiently utilize a larger\nnumber of demonstrations in ICL remains an ongo-\ning area of research.\n12 Conclusion\nIn this paper, we survey the existing ICL literature\nand provide an extensive review of advanced ICL\ntechniques, including training strategies, demon-\nstration designing strategies, evaluation datasets\nand resources, as well as related analytical studies.\nFurthermore, we highlight critical challenges and\npotential directions for future research. To the best\nof our knowledge, this is the first survey about ICL.\nWe hope this survey can highlight the current re-\nsearch status of ICL and shed light on future work\non this promising paradigm.\nReferences\nEkin Akyürek, Dale Schuurmans, Jacob An-\ndreas, Tengyu Ma, and Denny Zhou. 2022.\nWhat learning algorithm is in-context learn-\ning? investigations with linear models. CoRR ,\nabs/2211.15661.\nJean-Baptiste Alayrac, Jeff Donahue, Pauline Luc,\nAntoine Miech, Iain Barr, Yana Hasson, Karel\nLenc, Arthur Mensch, Katherine Millican, Mal-\ncolm Reynolds, et al. 2022. Flamingo: a vi-\nsual language model for few-shot learning. Ad-\nvances in Neural Information Processing Sys-\ntems, 35:23716–23736.\nShengnan An, Zeqi Lin, Qiang Fu, Bei Chen,\nNanning Zheng, Jian-Guang Lou, and Dong-\nmei Zhang. 2023. How do in-context exam-\nples affect compositional generalization? CoRR ,\nabs/2305.04835.\nAmir Bar, Yossi Gandelsman, Trevor Darrell,\nAmir Globerson, and Alexei Efros. 2022. Vi-\nsual prompting via image inpainting. Ad-\nvances in Neural Information Processing Sys-\ntems, 35:25005–25017.\nRichard Bellman. 1957. A markovian decision\nprocess. Journal of mathematics and mechanics ,\npages 679–684.\nRishi Bommasani, Drew A. 
Hudson, Ehsan Adeli,\nRuss Altman, Simran Arora, Sydney von Arx,Michael S. Bernstein, Jeannette Bohg, Antoine\nBosselut, Emma Brunskill, Erik Brynjolfsson,\nS. Buch, Dallas Card, Rodrigo Castellon, Ni-\nladri S. Chatterji, Annie S. Chen, Kathleen A.\nCreel, Jared Davis, Dora Demszky, Chris Don-\nahue, Moussa Doumbouya, Esin Durmus, Ste-\nfano Ermon, John Etchemendy, Kawin Etha-\nyarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale,\nLauren E. Gillespie, Karan Goel, Noah D.\nGoodman, Shelby Grossman, Neel Guha, Tat-\nsunori Hashimoto, Peter Henderson, John He-\nwitt, Daniel E. Ho, Jenny Hong, Kyle Hsu,\nJing Huang, Thomas F. Icard, Saahil Jain, Dan\nJurafsky, Pratyusha Kalluri, Siddharth Karam-\ncheti, Geoff Keeling, Fereshte Khani, O. Khat-\ntab, Pang Wei Koh, Mark S. Krass, Ranjay Kr-\nishna, Rohith Kuditipudi, Ananya Kumar, Faisal\nLadhak, Mina Lee, Tony Lee, Jure Leskovec,\nIsabelle Levent, Xiang Lisa Li, Xuechen Li,\nTengyu Ma, Ali Malik, Christopher D. Man-\nning, Suvir P. Mirchandani, Eric Mitchell,\nZanele Munyikwa, Suraj Nair, Avanika Narayan,\nDeepak Narayanan, Benjamin Newman, Allen\nNie, Juan Carlos Niebles, Hamed Nilforoshan,\nJ. F. Nyarko, Giray Ogut, Laurel Orr, Isabel\nPapadimitriou, Joon Sung Park, Chris Piech,\nEva Portelance, Christopher Potts, Aditi Raghu-\nnathan, Robert Reich, Hongyu Ren, Frieda\nRong, Yusuf H. Roohani, Camilo Ruiz, Jack\nRyan, Christopher R’e, Dorsa Sadigh, Shiori\n]","The context discusses advanced training strategies in In-Context Learning (ICL), including training strategies, demonstration designing strategies, evaluation datasets, and related analytical studies.",simple,"[{'page_label': '13', 'file_name': '2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_path': '/home/peter-legion-wsl2/peter-projects/regen-ai/nbs/data/prompt-engineering-papers/2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_type': 'application/pdf', 'file_size': 4898135, 'creation_date': '2024-04-13', 'last_modified_date': '2024-04-13'}]",True
1,What is the relationship between Gröbner bases and convex polytopes according to Bernd Sturmfels?,"[20 CAPRICE STANLEY AND TOBIAS WINDISCH\n23. Alistair Sinclair, Improved Bounds for Mixing Rates of Markov Chains and Multic ommodity Flow , Com-\nbinatorics, Probability and Computing 1(1992), no. 4, 351–370.\n24. Bernd Sturmfels, Gr¨ obner bases and convex polytopes , American Mathematical Society, Providence, R.I.,\n1996.\n25. Seth Sullivant, Markov bases of binary graph models , Annals of Combinatorics 7(2003), 441–466.\n26. Santosh S. Vempala, Geometric Random Walks: A Survey , MSRI Combinatorial and Computational\nGeometry 52(2005), 573–612.\n27. Tobias Windisch, Rapid mixing and Markov bases , preprint, arXiv:1505.03018 (2015), 1–18.\nNC State University, Raleigh, NC 27695, USA\nE-mail address :crstanl2@ncsu.edu\nOtto-von-Guericke Universit ¨at, Magdeburg, Germany\nE-mail address :windisch@ovgu.de]",The relationship between Gröbner bases and convex polytopes is discussed by Bernd Sturmfels in his work.,simple,"[{'page_label': '20', 'file_name': '1605.08386v1.Heat_bath_random_walks_with_Markov_bases.pdf', 'file_path': '/home/peter-legion-wsl2/peter-projects/regen-ai/nbs/data/prompt-engineering-papers/1605.08386v1.Heat_bath_random_walks_with_Markov_bases.pdf', 'file_type': 'application/pdf', 'file_size': 289178, 'creation_date': '2024-04-13', 'last_modified_date': '2024-04-13'}]",True
2,"What area of research in in-context learning focuses on using more demonstrations efficiently, and what future research directions are highlighted?","[ment aims to improve the scalability and efficiency\nof ICL. As LMs continue to scale up, exploring\nways to effectively and efficiently utilize a larger\nnumber of demonstrations in ICL remains an ongo-\ning area of research.\n12 Conclusion\nIn this paper, we survey the existing ICL literature\nand provide an extensive review of advanced ICL\ntechniques, including training strategies, demon-\nstration designing strategies, evaluation datasets\nand resources, as well as related analytical studies.\nFurthermore, we highlight critical challenges and\npotential directions for future research. To the best\nof our knowledge, this is the first survey about ICL.\nWe hope this survey can highlight the current re-\nsearch status of ICL and shed light on future work\non this promising paradigm.\nReferences\nEkin Akyürek, Dale Schuurmans, Jacob An-\ndreas, Tengyu Ma, and Denny Zhou. 2022.\nWhat learning algorithm is in-context learn-\ning? investigations with linear models. CoRR ,\nabs/2211.15661.\nJean-Baptiste Alayrac, Jeff Donahue, Pauline Luc,\nAntoine Miech, Iain Barr, Yana Hasson, Karel\nLenc, Arthur Mensch, Katherine Millican, Mal-\ncolm Reynolds, et al. 2022. Flamingo: a vi-\nsual language model for few-shot learning. Ad-\nvances in Neural Information Processing Sys-\ntems, 35:23716–23736.\nShengnan An, Zeqi Lin, Qiang Fu, Bei Chen,\nNanning Zheng, Jian-Guang Lou, and Dong-\nmei Zhang. 2023. How do in-context exam-\nples affect compositional generalization? CoRR ,\nabs/2305.04835.\nAmir Bar, Yossi Gandelsman, Trevor Darrell,\nAmir Globerson, and Alexei Efros. 2022. Vi-\nsual prompting via image inpainting. Ad-\nvances in Neural Information Processing Sys-\ntems, 35:25005–25017.\nRichard Bellman. 1957. A markovian decision\nprocess. Journal of mathematics and mechanics ,\npages 679–684.\nRishi Bommasani, Drew A. 
Hudson, Ehsan Adeli,\nRuss Altman, Simran Arora, Sydney von Arx,Michael S. Bernstein, Jeannette Bohg, Antoine\nBosselut, Emma Brunskill, Erik Brynjolfsson,\nS. Buch, Dallas Card, Rodrigo Castellon, Ni-\nladri S. Chatterji, Annie S. Chen, Kathleen A.\nCreel, Jared Davis, Dora Demszky, Chris Don-\nahue, Moussa Doumbouya, Esin Durmus, Ste-\nfano Ermon, John Etchemendy, Kawin Etha-\nyarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale,\nLauren E. Gillespie, Karan Goel, Noah D.\nGoodman, Shelby Grossman, Neel Guha, Tat-\nsunori Hashimoto, Peter Henderson, John He-\nwitt, Daniel E. Ho, Jenny Hong, Kyle Hsu,\nJing Huang, Thomas F. Icard, Saahil Jain, Dan\nJurafsky, Pratyusha Kalluri, Siddharth Karam-\ncheti, Geoff Keeling, Fereshte Khani, O. Khat-\ntab, Pang Wei Koh, Mark S. Krass, Ranjay Kr-\nishna, Rohith Kuditipudi, Ananya Kumar, Faisal\nLadhak, Mina Lee, Tony Lee, Jure Leskovec,\nIsabelle Levent, Xiang Lisa Li, Xuechen Li,\nTengyu Ma, Ali Malik, Christopher D. Man-\nning, Suvir P. Mirchandani, Eric Mitchell,\nZanele Munyikwa, Suraj Nair, Avanika Narayan,\nDeepak Narayanan, Benjamin Newman, Allen\nNie, Juan Carlos Niebles, Hamed Nilforoshan,\nJ. F. Nyarko, Giray Ogut, Laurel Orr, Isabel\nPapadimitriou, Joon Sung Park, Chris Piech,\nEva Portelance, Christopher Potts, Aditi Raghu-\nnathan, Robert Reich, Hongyu Ren, Frieda\nRong, Yusuf H. Roohani, Camilo Ruiz, Jack\nRyan, Christopher R’e, Dorsa Sadigh, Shiori\n]",Exploring ways to effectively and efficiently utilize a larger number of demonstrations in ICL remains an ongoing area of research. 
Future research directions highlighted include critical challenges and potential directions for future research.,reasoning,"[{'page_label': '13', 'file_name': '2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_path': '/home/peter-legion-wsl2/peter-projects/regen-ai/nbs/data/prompt-engineering-papers/2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_type': 'application/pdf', 'file_size': 4898135, 'creation_date': '2024-04-13', 'last_modified_date': '2024-04-13'}]",True
3,What benefits does ICL offer for communication with LLMs and its resemblance to human decision-making?,"[A Survey on In-context Learning\nQingxiu Dong1, Lei Li1, Damai Dai1, Ce Zheng1, Zhiyong Wu2,\nBaobao Chang1, Xu Sun1, Jingjing Xu2, Lei Li3and Zhifang Sui1\n1MOE Key Lab of Computational Linguistics, School of Computer Science, Peking University\n2Shanghai AI Lab3University of California, Santa Barbara\n{dqx,lilei}@stu.pku.edu.cn, wuzhiyong@pjlab.org.cn, lilei@cs.ucsb.edu\n{daidamai,zce1112zslx,chbb,xusun,jingjingxu,szf}@pku.edu.cn\nAbstract\nWith the increasing ability of large language\nmodels (LLMs), in-context learning (ICL)\nhas become a new paradigm for natural\nlanguage processing (NLP), where LLMs\nmake predictions only based on contexts aug-\nmented with a few examples. It has been a\nnew trend to explore ICL to evaluate and ex-\ntrapolate the ability of LLMs. In this paper,\nwe aim to survey and summarize the progress\nand challenges of ICL. We first present a for-\nmal definition of ICL and clarify its corre-\nlation to related studies. Then, we organize\nand discuss advanced techniques, including\ntraining strategies, demonstration designing\nstrategies, as well as related analysis. Finally,\nwe discuss the challenges of ICL and provide\npotential directions for further research. We\nhope that our work can encourage more re-\nsearch on uncovering how ICL works and\nimproving ICL.\n1 Introduction\nWith the scaling of model size and corpus size (De-\nvlin et al., 2019; Radford et al., 2019; Brown et al.,\n2020; Chowdhery et al., 2022), large language\nmodels (LLMs) demonstrate an in-context learn-\ning (ICL) ability, that is, learning from a few ex-\namples in the context. Many studies have shown\nthat LLMs can perform a series of complex tasks\nthrough ICL, such as solving mathematical reason-\ning problems (Wei et al., 2022c). 
These strong abil-\nities have been widely verified as emerging abilities\nfor large language models (Wei et al., 2022b).\nThe key idea of in-context learning is to learn\nfrom analogy. Figure 1 gives an example describ-\ning how language models make decisions with ICL.\nFirst, ICL requires a few examples to form a demon-\nstration context. These examples are usually writ-\nten in natural language templates. Then, ICL con-\ncatenates a query question and a piece of demon-\nstration context together to form a prompt, which\nReview: Delicious food! Review: The food is awful. … Review: Terrible dishes!\nPositiveLarge Language ModelReview: Good meal!Sentiment:\nInputSentiment: PositiveSentiment: Negative…Sentiment: NegativeOutputParameter Freeze kDemonstrationExamplesNewQuery \nTemplateDelicious food! The food is awful. Terrible dishes!…Review: [Text] Sentiment: [Label]TextLabel100…\nFigure 1: Illustration of in-context learning. ICL re-\nquires a piece of demonstration context containing a few\nexamples written in natural language templates. Taking\nthe demonstration and a query as the input, large lan-\nguage models are responsible for making predictions.\nis then fed into the language model for prediction.\nDifferent from supervised learning requiring a train-\ning stage that uses backward gradients to update\nmodel parameters, ICL does not conduct parameter\nupdates and directly performs predictions on the\npretrained language models. The model is expected\nto learn the pattern hidden in the demonstration and\naccordingly make the right prediction.\nAs a new paradigm, ICL has multiple attractive\nadvantages. First, since the demonstration is writ-\nten in natural language, it provides an interpretable\ninterface to communicate with LLMs (Brown et al.,\n2020). This paradigm makes it much easier to in-\ncorporate human knowledge into LLMs by chang-\ning the demonstration and templates (Liu et al.,\n2022; Lu et al., 2022; Wu et al., 2022; Wei et al.,\n2022c). 
Second, in-context learning is similar to\nthe decision process of human beings by learning\nfrom analogy (Winston, 1980). Third, compared\nwith supervised training, ICL is a training-free\nlearning framework. This could not only greatly re-\nduce the computation costs for adapting the model\n]","ICL offers benefits for communication with LLMs by providing an interpretable interface through natural language, making it easier to incorporate human knowledge into LLMs. Additionally, ICL resembles human decision-making by learning from analogy, similar to how humans make decisions.",multi_context,"[{'page_label': '1', 'file_name': '2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_path': '/home/peter-legion-wsl2/peter-projects/regen-ai/nbs/data/prompt-engineering-papers/2301.00234v3.A_Survey_on_In_context_Learning.pdf', 'file_type': 'application/pdf', 'file_size': 4898135, 'creation_date': '2024-04-13', 'last_modified_date': '2024-04-13'}]",True
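
The log above reports that one `MultiContextEvolution` exceeded its retries, and only four rows are displayed although `TEST_SIZE` is 5. A minimal sketch of how one might compare the requested question-type shares with what actually came back is given below. The `summarize` helper is hypothetical (it is not part of Ragas or Phoenix), and the `observed_types` list is copied by hand from the `evolution_type` column of the four rows shown. The targets are fractional because the shares are simply multiplied by the test size; how Ragas rounds them internally is not shown in this notebook.

```python
from collections import Counter

# Requested shares, keyed by the evolution_type strings shown in the output above.
requested = {"simple": 0.5, "reasoning": 0.25, "multi_context": 0.25}
test_size = 5

# evolution_type values copied from the four rows displayed above.
observed_types = ["simple", "simple", "reasoning", "multi_context"]


def summarize(requested, observed_types, test_size):
    """Return (rows, shortfall): per-type target vs. observed counts.

    Hypothetical helper for illustration only.
    """
    if abs(sum(requested.values()) - 1.0) > 1e-9:
        raise ValueError("question-type shares should sum to 1")
    observed = Counter(observed_types)
    rows = [
        (name, share * test_size, observed.get(name, 0))
        for name, share in requested.items()
    ]
    # Number of requested samples that never made it into the test set.
    shortfall = test_size - len(observed_types)
    return rows, shortfall


rows, shortfall = summarize(requested, observed_types, test_size)
for name, target, count in rows:
    print(f"{name:<14} target={target:.2f} observed={count}")
print(f"missing samples: {shortfall}")
```

In the notebook itself, `observed_types` could be replaced with `test_df["evolution_type"].tolist()` so the check follows whatever the generator returned on a given run.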
