# Example: Directly accessing BEAR data using Milvus's vector store (development use only)

The standard BEAR API may not expose every available function.
For maximum flexibility, you can query the vector store directly with `PyMilvus`.

For more details, refer to the [Milvus documentation](https://milvus.io/docs).

### Requirements

1. Store the Milvus server information in a `.env` file, and make sure your `.gitignore` excludes `.env`.

    ```sh
    MILVUS_HOST=vcrge-dsi-027313.cci.wisc.edu
    MILVUS_PORT=19530
    MILVUS_DB_NAME=dev
    MILVUS_TOKEN=user:your_password  # Update this line
    ```

1. Add `pymilvus>=2.5.14` and `python-dotenv` to your project dependencies. For pure-Python projects I recommend `uv`; for projects that need conda-forge packages, consider `pixi`.

1. You must be on the UW network (use UW's VPN if you are not).
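
A missing `.env` entry tends to surface later as a confusing connection error, so it can help to check the settings up front. The following is a minimal sketch: `missing_milvus_settings` and `REQUIRED_SETTINGS` are hypothetical names, not part of BEAR or PyMilvus, and it assumes the host and token are the only values without a usable default.

```python
import os
from typing import Mapping

# Hypothetical helper, not part of BEAR or pymilvus.
# Assumption: MILVUS_PORT and MILVUS_DB_NAME can fall back to defaults,
# so only the host and the token are treated as required.
REQUIRED_SETTINGS = ("MILVUS_HOST", "MILVUS_TOKEN")


def missing_milvus_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required settings that are unset or blank."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

Call `missing_milvus_settings()` after `load_dotenv()`; an empty list means both required values were found.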



In [None]:
import os
from dotenv import load_dotenv
from pymilvus import MilvusClient

load_dotenv()


In [None]:
client = MilvusClient(
    uri=f"tcp://{os.getenv('MILVUS_HOST')}:{os.getenv('MILVUS_PORT', 19530)}",
    token=os.getenv("MILVUS_TOKEN", ""),
    db_name=os.getenv("MILVUS_DB_NAME", "dev"),
)

In [None]:
client.list_collections()

In [None]:
client.query("person", filter="display_name == 'Lisa A. Frank'")

In [None]:
client.get("work", ids=["https://openalex.org/W10005870"])

If the connection works, the cells above should return some data. For more details on using `MilvusClient`, refer to the [Milvus documentation](https://milvus.io/docs).
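
Filters are plain strings, so matching several names at once means building an `in [...]` expression. Here is a minimal sketch of one way to do that; `in_filter` is a hypothetical helper and "Jane Doe" is a made-up name. Rather than guess at Milvus escaping rules, it assumes values contain no quotes or backslashes and rejects any that do.

```python
import json
from typing import Iterable


def in_filter(field: str, values: Iterable[str]) -> str:
    """Build a filter such as: display_name in ["A", "B"]."""
    values = list(values)
    for value in values:
        # Assumption: escaping rules are not handled here, so values
        # containing quotes or backslashes are rejected outright.
        if any(ch in value for ch in "\"'\\"):
            raise ValueError(f"unsupported character in value: {value!r}")
    # json.dumps renders the list with double-quoted strings.
    return f"{field} in {json.dumps(values, ensure_ascii=False)}"


print(in_filter("display_name", ["Lisa A. Frank", "Jane Doe"]))
# prints: display_name in ["Lisa A. Frank", "Jane Doe"]
```

The resulting string can be passed as the `filter=` argument of `client.query`.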

### Searching with embedding

In [None]:
import httpx

query = "machine learning"
BEAR_API_BASE_URL = "https://bear-api.services.dsi.wisc.edu"

response = httpx.post(f"{BEAR_API_BASE_URL}/embed", json={"texts": [query]})
response.raise_for_status()  # fail early if the embedding request did not succeed
query_embeddings = response.json()["embeddings"]

# The embedding dimension depends on the model BEAR is configured with:
# currently 1024 with `intfloat/multilingual-e5-large-instruct`; this may change later.
len(query_embeddings[0])

In [None]:
client.search(
    collection_name="work",
    data=query_embeddings,
    limit=3,
    output_fields=["title", "authors", "publication_year"],
)
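
`client.search` returns one list of hits per query vector. The sketch below flattens that into plain rows that are easier to print or load into a table. `flatten_hits` is a hypothetical helper, and it assumes each hit behaves like a dict with `distance` and `entity` keys; check this against your installed `pymilvus` version. The sample data is made up for illustration.

```python
from typing import Any, Iterable, Mapping, Sequence


def flatten_hits(results: Iterable[Sequence[Mapping[str, Any]]]) -> list[dict[str, Any]]:
    """Flatten search results into one row per hit.

    Assumption: `results` holds one sequence of hits per query vector and
    each hit is dict-like with "distance" and "entity" keys.
    """
    rows = []
    for query_index, hits in enumerate(results):
        for rank, hit in enumerate(hits, start=1):
            row = {"query": query_index, "rank": rank, "distance": hit["distance"]}
            row.update(hit.get("entity", {}))  # the requested output_fields
            rows.append(row)
    return rows


# Made-up sample with the assumed shape, for illustration only.
sample = [[
    {"id": "W1", "distance": 0.91, "entity": {"title": "Paper A", "publication_year": 2020}},
    {"id": "W2", "distance": 0.87, "entity": {"title": "Paper B", "publication_year": 2018}},
]]
for row in flatten_hits(sample):
    print(row["rank"], row["distance"], row["title"])
```

With a live connection, the return value of the `client.search(...)` call above would take the place of `sample`.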