Added jupyter notebook for Geospatial Blog #231

Merged 38 commits on May 29, 2024
Commits
c9d5468
Added jupyter notebook for Geospatial Blog
philippkahr May 3, 2024
56ad4b1
Tokenization notebook: fix step_size calculation (#233)
maxjakob May 8, 2024
8f68be1
notebook for blog post on RBAC and RAG (#234)
jeffvestal May 8, 2024
7afdafb
[Playground] notebooks for ingestion suitable for playground (#232)
joemcelroy May 9, 2024
020e937
update colab links (#236)
joemcelroy May 10, 2024
b41721f
Removed iamge
philippkahr May 13, 2024
85296a1
Adds Cohere & Elasticsearch tutorial notebook (#235)
szabosteve May 13, 2024
54efa08
Stop using perform_request in search tutorial (#238)
pquentin May 14, 2024
4c60c28
fix: keyword & filtering colab link (#241)
TattdCodeMonkey May 16, 2024
b37a6b5
fixed a few tiny things regarding the geospatial searc
philippkahr May 17, 2024
f8bb1aa
Splitter notebook: extend new strategy class (#242)
maxjakob May 17, 2024
ab8f984
Moved to another folder
philippkahr May 21, 2024
c1cc63c
adding content for an upcoming blog post (#247)
JessicaGarson May 21, 2024
79aa22e
updating example (#248)
JessicaGarson May 21, 2024
4b8923d
Show pip dependencies after CI run (#239)
miguelgrinberg May 22, 2024
d4ae76f
Add per-version testing restrictions (#246)
miguelgrinberg May 22, 2024
c12c84f
added notebook for retrievers blog (#245)
jeffvestal May 22, 2024
5f505a1
Iclude code to blog using-nvidia-nim-with-elasticsearch-vector-store …
salgado May 24, 2024
bb87e8f
Fix bedrock integration and remove incorrect part regarding ELSER (#249)
yansavitski May 29, 2024
504b4a7
Let's see if this is fixed finally
philippkahr May 29, 2024
bf31de5
Reformated with black formatter, maybe that does the trick?
philippkahr May 29, 2024
beac6d7
is it the cell output, or is it the code that is breaking, let's check
philippkahr May 29, 2024
b047529
Rerun the broken cell with the `distance between airbnb` to see if it…
philippkahr May 29, 2024
4df2bfe
maybe now?
philippkahr May 29, 2024
b206483
What exactly is the problem? The print statement?
philippkahr May 29, 2024
8b83dac
Reformatted with the pre-commit hook
philippkahr May 29, 2024
9dc6b53
Added jupyter notebook for Geospatial Blog
philippkahr May 3, 2024
89a4e31
Removed iamge
philippkahr May 13, 2024
7c9ece8
fixed a few tiny things regarding the geospatial searc
philippkahr May 17, 2024
c62d465
Moved to another folder
philippkahr May 21, 2024
022900b
Let's see if this is fixed finally
philippkahr May 29, 2024
3bf8571
Reformated with black formatter, maybe that does the trick?
philippkahr May 29, 2024
a408210
is it the cell output, or is it the code that is breaking, let's check
philippkahr May 29, 2024
b5547f7
Rerun the broken cell with the `distance between airbnb` to see if it…
philippkahr May 29, 2024
298a433
maybe now?
philippkahr May 29, 2024
ba72137
What exactly is the problem? The print statement?
philippkahr May 29, 2024
07890f2
Reformatted with the pre-commit hook
philippkahr May 29, 2024
7b88d68
Merge branch 'geospatial' of https://github.com/philippkahr/elasticse…
philippkahr May 29, 2024
7 changes: 5 additions & 2 deletions .github/workflows/tests.yml
@@ -49,6 +49,9 @@ jobs:
         run: make install-nbtest
       - name: Warm up
         continue-on-error: true
-        run: sleep 30 && PATCH_ES=1 ELASTIC_CLOUD_ID=foo ELASTIC_API_KEY=bar bin/nbtest notebooks/search/00-quick-start.ipynb
+        run: sleep 30 && PATCH_ES=1 ELASTIC_CLOUD_ID=foo ELASTIC_API_KEY=bar ES_STACK=${{ matrix.es_stack }} bin/nbtest notebooks/search/00-quick-start.ipynb
       - name: Run tests
-        run: PATCH_ES=1 FORCE_COLOR=1 make -s test
+        run: PATCH_ES=1 FORCE_COLOR=1 ES_STACK=${{ matrix.es_stack }} make -s test
+      - name: Show installed Python dependencies
+        if: always()
+        run: .venv/bin/pip freeze
8 changes: 7 additions & 1 deletion CONTRIBUTING.md
@@ -88,7 +88,13 @@ To run all supported notebooks under this tool, run the following from the top-l
make test
```

-To add a new notebook to our automated testing, you will need to modify the `Makefile` in the directory where your notebook is located. If there is no Makefile in your directory, you can use the one in the `notebooks/search` directory as a model to create one, and then add a reference to it in the top-level `Makefile`.
If you want to test under a previous release of Elasticsearch, you can define the `ES_STACK` variable with the target version so that notebooks that are not intended for that or previous versions are skipped. For example, here is how to only run tests for the 8.12 release of Elasticsearch:

```bash
ES_STACK=8.12 make test
```

Any notebooks that are added in subdirectories under `notebooks` are automatically picked up for testing. Notebooks that are known not to be suitable for automatic testing should be added to one of the exemption lists in the `bin/find-notebooks-to-test.sh` script. If the notebook should never be tested, add it to the `EXEMPT_NOTEBOOKS` list. If the notebook should be tested only under some versions of Elasticsearch, add it to the appropriate versioned exemption list. For example, the `EXEMPT_NOTEBOOKS__8_12` list should include notebooks that should not be tested under releases 8.12 and older. New versioned lists are automatically recognized and can be added as needed.

## Contributing to example applications 💻

43 changes: 41 additions & 2 deletions bin/find-notebooks-to-test.sh
@@ -1,6 +1,7 @@
 #!/bin/bash
-# add any notebooks that are currently not testable to the exempt list
+
 EXEMPT_NOTEBOOKS=(
+    # Add any notebooks that are currently not testable to the exempt list
     "notebooks/esql/esql-getting-started.ipynb"
     "notebooks/search/07-inference.ipynb"
     "notebooks/search/08-learning-to-rank.ipynb"
@@ -14,6 +15,7 @@ EXEMPT_NOTEBOOKS=(
     "notebooks/generative-ai/question-answering.ipynb"
     "notebooks/generative-ai/chatbot.ipynb"
     "notebooks/integrations/amazon-bedrock/langchain-qa-example.ipynb"
+    "notebooks/integrations/cohere/cohere-elasticsearch.ipynb"
     "notebooks/integrations/cohere/inference-cohere.ipynb"
     "notebooks/integrations/llama-index/intro.ipynb"
     "notebooks/integrations/gemini/vector-search-gemini-elastic.ipynb"
@@ -24,9 +26,46 @@ EXEMPT_NOTEBOOKS=(
     "notebooks/enterprise-search/app-search-engine-exporter.ipynb"
 )

+# Per-version testing exceptions
+# use variables named EXEMPT_NOTEBOOKS__{major}_{minor} to list notebooks that
+# cannot run on that stack version or older
+# Examples:
+# EXEMPT_NOTEBOOKS__8 for notebooks that must be skipped on all versions 8.x and older
+# EXEMPT_NOTEBOOKS__8_12 for notebooks that must be skipped on versions 8.12 and older
+
+EXEMPT_NOTEBOOKS__8_12=(
+    # Add any notebooks that must be skipped on versions 8.12 or older here
+    "notebooks/document-chunking/with-index-pipelines.ipynb"
+    "notebooks/document-chunking/with-langchain-splitters.ipynb"
+    "notebooks/integrations/hugging-face/loading-model-from-hugging-face.ipynb"
+    "notebooks/langchain/langchain-using-own-model.ipynb"
+)
+
+# this function parses a version given as M[.N[.P]] or M[_N[_P]] into a numeric form
+function parse_version { echo "$@" | awk -F'[._]' '{ printf("%02d%02d\n", $1, $2); }'; }
+
+# this is the version CI is running
+ci_version=$(parse_version ${ES_STACK:-99.99})
+
 ALL_NOTEBOOKS=$(find notebooks -name "*.ipynb" | grep -v "_nbtest" | grep -v ".ipynb_checkpoints" | sort)
 for notebook in $ALL_NOTEBOOKS; do
-    if [[ ! "${EXEMPT_NOTEBOOKS[@]}" =~ $notebook ]]; then
+    skip=
+    # check the master exception list
+    if [[ "${EXEMPT_NOTEBOOKS[@]}" =~ $notebook ]]; then
+        skip=yes
+    else
+        # check the versioned exception lists
+        for exempt_key in ${!EXEMPT_NOTEBOOKS__*}; do
+            exempt_version=$(parse_version ${exempt_key/EXEMPT_NOTEBOOKS__/})
+            if [ $exempt_version -ge $ci_version ]; then
+                exempt_notebooks=$(eval 'echo ${'${exempt_key}'[@]}')
+                if [[ "${exempt_notebooks[@]}" =~ $notebook ]]; then
+                    skip=yes
+                fi
+            fi
+        done
+    fi
+    if [[ "$skip" == "" ]]; then
         echo $notebook
     fi
 done
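For readers who find the bash indirection in `bin/find-notebooks-to-test.sh` hard to follow, the selection logic can be sketched in Python. This is only an illustration, not part of the PR: the function `notebooks_to_test` and the dictionary keyed by version suffix are hypothetical names, and the sketch uses exact membership tests where the script uses a regex match against the joined list. As in the awk one-liner, a missing minor component is treated as 0.

```python
import re


def parse_version(text):
    """Convert 'M[.N[.P]]' or 'M[_N[_P]]' to an integer MMNN, e.g. '8.12' -> 812."""
    parts = re.split(r"[._]", text)
    major = int(parts[0])
    # A missing minor component counts as 0, like awk's %02d on an empty field.
    minor = int(parts[1]) if len(parts) > 1 and parts[1] else 0
    return major * 100 + minor


def notebooks_to_test(all_notebooks, exempt, versioned_exempt, es_stack="99.99"):
    """Return the sorted notebooks that are not exempt for the given stack version.

    exempt           -- notebooks that are never tested
    versioned_exempt -- hypothetical mapping such as {"8_12": {...}}; each set is
                        skipped when the stack version is that version or older
    es_stack         -- value of ES_STACK; defaults to 99.99 as in the script
    """
    ci_version = parse_version(es_stack)
    selected = []
    for notebook in sorted(all_notebooks):
        if notebook in exempt:
            continue
        skip = any(
            parse_version(key) >= ci_version and notebook in names
            for key, names in versioned_exempt.items()
        )
        if not skip:
            selected.append(notebook)
    return selected
```

With `ES_STACK` unset the default of 99.99 is higher than any versioned list, so only the unconditional `EXEMPT_NOTEBOOKS` entries are skipped.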
8 changes: 3 additions & 5 deletions example-apps/search-tutorial/v3/search-tutorial/search.py
@@ -70,15 +70,13 @@ def reindex(self):

     def search(self, **query_args):
         # sub_searches is not currently supported in the client, so we send
-        # search requests as raw requests
+        # search requests using the body argument
         if "from_" in query_args:
             query_args["from"] = query_args["from_"]
             del query_args["from_"]
-        return self.es.perform_request(
-            "GET",
-            f"/my_documents/_search",
+        return self.es.search(
+            index="my_documents",
             body=json.dumps(query_args),
-            headers={"Content-Type": "application/json", "Accept": "application/json"},
         )

     def retrieve_document(self, id):
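The request-body handling in `search()` can be tried without a cluster. The sketch below isolates the step that renames the Python-safe `from_` keyword to the `from` key that Elasticsearch expects and serializes the result, which is what the method passes as `body`. The helper name `build_search_body` is hypothetical and does not exist in the tutorial code.

```python
import json


def build_search_body(**query_args):
    """Rename `from_` to `from` and serialize the query arguments to JSON."""
    body = dict(query_args)  # copy so the caller's arguments are not mutated
    if "from_" in body:
        # `from` is a reserved word in Python, so callers pass `from_` instead
        body["from"] = body.pop("from_")
    return json.dumps(body)


print(build_search_body(query={"match_all": {}}, size=5, from_=10))
```

The returned string is what the diff above hands to `self.es.search(index="my_documents", body=...)`.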
7 changes: 5 additions & 2 deletions notebooks/README.md
@@ -9,13 +9,16 @@ Notebooks are organized into the following folders:

- [`search`](./search/): Notebooks that demonstrate the fundamentals of Elasticsearch, like indexing embeddings, running lexical, semantic and _hybrid_ searches, and more.

-- [`enterprise-search`](./enterprise-search/): Notebooks that demonstrate use cases for working with and exporting from Elastic Enterprise Search, App Search, or Workplace Search.
- [`doc-ingestion-and-chunking`](./ingestion-and-chunking/): Notebooks that demonstrate how to ingest and chunk documents for indexing in Elasticsearch from PDF, HTML and JSON with ELSER.

- [`generative-ai`](./generative-ai/): Notebooks that demonstrate various use cases for Elasticsearch as the retrieval engine and vector store for LLM-powered applications.

- [`langchain`](./langchain/): Notebooks that demonstrate how to integrate Elastic with [LangChain](https://langchain-langchain.vercel.app/docs/get_started/introduction.html), a framework for developing applications powered by language models.

- [`integrations`](./integrations/): Notebooks that demonstrate how to integrate popular services and projects with Elasticsearch:

- [OpenAI](./integrations/openai)
- [Hugging Face](./integrations/hugging-face)
- [LlamaIndex](./integrations/llama-index)

-- [`langchain`](./langchain/): Notebooks that demonstrate how to integrate Elastic with [LangChain](https://langchain-langchain.vercel.app/docs/get_started/introduction.html), a framework for developing applications powered by language models.
- [`enterprise-search`](./enterprise-search/): Notebooks that demonstrate use cases for working with and exporting from Elastic Enterprise Search, App Search, or Workplace Search.