From 453f3d5fa1cd6d14d61f203ddb1bce9abedc2047 Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Tue, 18 Mar 2025 01:01:45 -0700
Subject: [PATCH 1/2] Update README.md

---
 examples/manuals_llm_extraction/README.md | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/examples/manuals_llm_extraction/README.md b/examples/manuals_llm_extraction/README.md
index 65b74a5f..dc2e5956 100644
--- a/examples/manuals_llm_extraction/README.md
+++ b/examples/manuals_llm_extraction/README.md
@@ -1,9 +1,16 @@
+# Structured Data Extraction from PDF with Ollama and CocoIndex
+
+![Structured data extraction with Ollama and CocoIndex](https://cocoindex.io/blogs/assets/images/cocoindex-ollama-structured-extraction-from-pdf-6ee15b1e0fe304063dc78f04153fb385.png)
+
+
 In this example, we
 
 * Converts PDFs (generated from a few Python docs) into Markdown.
 * Extract structured information from the Markdown using LLM.
 * Use a custom function to further extract information from the structured output.
 
+Please give [Cocoindex on Github](https://github.com/cocoindex-io/cocoindex) a star to support us if you like our work. Thank you so much with a warm coconut hug 🥥🤗. [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)
+
 ## Prerequisite
 
 Before running the example, you need to:
@@ -47,9 +54,14 @@ And run the SQL query:
 ```sql
 SELECT filename, module_info->'title' AS title, module_summary FROM modules_info;
 ```
+You should see results like:
+
+![Module Info Index](https://cocoindex.io/blogs/assets/images/module_info_index-ffaec6042ec3a18eaf94bed5b227a085.png)
+
 ## CocoInsight
 
-CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
+CocoInsight is in Early Access now (Free) 😊 You found us! It is
+A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
 
 Run CocoInsight to understand your RAG data pipeline:
 
@@ -57,4 +69,7 @@ Run CocoInsight to understand your RAG data pipeline:
 python main.py cocoindex server -c https://cocoindex.io
 ```
 
-Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight).
\ No newline at end of file
+Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight). It connects to your local CocoIndex server with zero data retention.
+
+You can view the pipeline flow and the data preview in the CocoInsight UI:
+![CocoInsight UI](https://cocoindex.io/blogs/assets/images/cocoinsight-edd71690dcc35b6c5cf1cb31b51b6f6f.png)

From e2e4b315f54f68de56b44ca56c97ed9022bda5f8 Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Tue, 18 Mar 2025 01:03:17 -0700
Subject: [PATCH 2/2] Update README.md

---
 examples/manuals_llm_extraction/README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/examples/manuals_llm_extraction/README.md b/examples/manuals_llm_extraction/README.md
index dc2e5956..8de33d9e 100644
--- a/examples/manuals_llm_extraction/README.md
+++ b/examples/manuals_llm_extraction/README.md
@@ -60,8 +60,7 @@ You should see results like:
 
 ## CocoInsight
 
-CocoInsight is in Early Access now (Free) 😊 You found us! It is
-A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
+CocoInsight is a tool to help you understand your data pipeline and data index. CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
 
 Run CocoInsight to understand your RAG data pipeline:
 