Priority Task : start using trulens to evaluate Gemini #1

Josephrp · 2023-12-15T13:16:11Z

🤔How To

Check our References

trulens github + notebooks : https://github.com/truera/trulens/tree/main/trulens_eval/examples

RAG Evaluation : https://lablab.ai/t/trulens-google-vertex-ai-tutorial-building-rag-applications
Comparing models with TruLens : https://github.com/truera/trulens/blob/main/trulens_eval/examples/expositional/use_cases/model_comparison.ipynb
Llama Index / Chain of Reasoning Eval with TruLens : https://github.com/truera/trulens/blob/main/trulens_eval/examples/expositional/frameworks/llama_index/llama_index_complex_evals.ipynb
Llama Index Retrieval Quality : https://github.com/truera/trulens/blob/main/trulens_eval/examples/expositional/frameworks/llama_index/llama_index_retrievalquality.ipynb

Ideas for Evaluation

RAG
System Prompt
Data Processing Pipeline
Image Inputs

Work

What it takes : literally just running a notebook.

Chroma, or other embeddings, to test
list of prompts to test
test combinations of prompts
multimodal evaluations
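
The "list of prompts" and "test combinations of prompts" items above could be prepared along these lines before they are fed into an evaluation notebook. This is only a minimal sketch: the prompt strings and the `build_test_cases` helper are hypothetical placeholders, not part of the repo or of TruLens.

```python
from itertools import product

# Hypothetical placeholder prompts; swap in the real prompts to be evaluated.
system_prompts = [
    "You are a concise data analyst.",
    "You are a careful assistant that cites its sources.",
]
user_prompts = [
    "Summarize the attached sales report.",
    "List three risks mentioned in the document.",
]


def build_test_cases(system_prompts, user_prompts):
    """Return one test case per (system prompt, user prompt) pair."""
    return [
        {"system": system, "user": user}
        for system, user in product(system_prompts, user_prompts)
    ]


test_cases = build_test_cases(system_prompts, user_prompts)
for case in test_cases:
    # Each case can then be sent to the model under evaluation.
    print(case["system"], "|", case["user"])
```

Each resulting dictionary is one prompt combination, so the same list can be reused across every model being compared.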

We will include the notebooks in the submission and write-up.

Josephrp · 2023-12-16T11:47:07Z

Hey there @mie-h and @Zochory : https://github.com/Tonic-AI/DataTonic/tree/main/evaluation is the folder where we will start working on the TruLens evaluations, which are a hackathon requirement and good practice while building an app 🫡

Josephrp · 2023-12-16T14:02:17Z

Hey there @mie-h & @Zochory : I added default prompts to the baseline prompts folder; we can use those in a TruLens evaluation.

Josephrp · 2023-12-16T15:03:26Z

Consider using this to generate "system prompts" for Gemini.

Josephrp · 2023-12-16T23:23:36Z

added an incomplete example notebook : https://github.com/Tonic-AI/DataTonic/blob/main/evaluation/results/modelcomparision.ipynb

Josephrp · 2023-12-18T10:45:35Z

Big thank you to 🏆😎 @MN-Noor for producing the first TruLens evaluation with Gemini on RAG, using OpenAI!

Open tasks :

Make a notebook to test Gemini MultiModal (image inputs)
Make a notebook to test more models against Gemini
Make a notebook to test the "new features of Gemini" like the censorship level.

We'll all work on this together. If everyone does one notebook, or at least contributes to a good one, we should have this task covered.
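
For the "test more models against Gemini" notebook, the overall shape of a comparison loop might look like the sketch below. Everything here is an assumption for illustration: `stub_gemini`, `stub_other`, `overlap_score`, and `compare_models` are hypothetical stand-ins, with the stubs replacing real model calls and the toy word-overlap score replacing a real TruLens feedback function.

```python
from statistics import mean


def stub_gemini(prompt):
    """Hypothetical stand-in for a Gemini call: echoes the prompt."""
    return f"[gemini] {prompt}"


def stub_other(prompt):
    """Hypothetical stand-in for a second model: echoes only the first word."""
    return "[other] " + " ".join(prompt.split()[:1])


def overlap_score(prompt, response):
    """Toy metric: fraction of prompt words that reappear in the response."""
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    if not prompt_words:
        return 0.0
    return len(prompt_words & response_words) / len(prompt_words)


def compare_models(models, prompts, score_fn):
    """Run every prompt through every model and return the mean score per model."""
    return {
        name: mean(score_fn(prompt, model(prompt)) for prompt in prompts)
        for name, model in models.items()
    }


models = {"gemini": stub_gemini, "other": stub_other}
prompts = ["Summarize the sales report", "List three risks"]
for name, score in compare_models(models, prompts, overlap_score).items():
    print(f"{name}: {score:.2f}")
```

Keeping the models behind plain callables means a real client can be dropped in for each stub without changing the loop.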

Zochory · 2023-12-18T22:21:44Z

Shouldn't we add other multimodal LLMs to the evals, like this one? https://huggingface.co/sshh12/Mistral-7B-LoRA-ImageBind-LLAVA

Josephrp assigned mie-h and Zochory Dec 15, 2023

Josephrp mentioned this issue Dec 16, 2023

Brainstorming concept : DataTonic, help and find the most optimized LLM model for an user usecase #3

Closed

Josephrp assigned MN-Noor Dec 16, 2023

Josephrp pinned this issue Dec 16, 2023

Josephrp assigned GoldenWind8, twilwa and Josephrp Dec 16, 2023

Josephrp added the help wanted label Dec 17, 2023

Josephrp assigned jsaluja and unassigned mie-h Dec 17, 2023

Josephrp added the good first issue label Dec 18, 2023

MN-Noor linked a pull request Dec 18, 2023 that will close this issue

Add files via upload #19

Merged

Josephrp closed this as completed in #19 Dec 18, 2023

Josephrp reopened this Dec 18, 2023

Josephrp changed the title ~~start using trulens to evaluate Gemini~~ Priority Task : start using trulens to evaluate Gemini Dec 19, 2023

Josephrp mentioned this issue Dec 19, 2023

Open Tasks : PR TruEra/TruLens #37

Closed

Josephrp unassigned GoldenWind8, jsaluja, twilwa and Zochory Dec 20, 2023

Repository owner deleted a comment from twilwa Dec 21, 2023

Josephrp closed this as completed Dec 21, 2023

Josephrp unpinned this issue Dec 22, 2023
