Milvus Kernel for Jupyter
pip install milvus_kernel
To install the latest version from this repo (the project is in the alpha stage, so updates may be frequent), run:
pip install git+https://github.com/Hourout/milvus_kernel.git
Add the kernel to Jupyter:
python3 -m milvus_kernel.install
ALL DONE!
View and remove the Milvus kernel:
jupyter kernelspec list
jupyter kernelspec remove milvus
Uninstall the Milvus kernel:
pip uninstall milvus-kernel
ALL DONE!
jupyter notebook
Create a client connection to the Milvus server with a URI of the following form:
milvus://127.0.0.1:19530
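The URI above packs the server host and port into one string. As a minimal sketch of how such a string breaks down (this is not the kernel's own parsing code, and `parse_milvus_uri` is a hypothetical helper name), the Python standard library can split it:

```python
from urllib.parse import urlsplit


def parse_milvus_uri(uri):
    """Split a milvus://host:port URI into a (host, port) tuple.

    Hypothetical helper for illustration; not part of milvus_kernel.
    """
    parts = urlsplit(uri)
    if parts.scheme != "milvus":
        raise ValueError(f"expected a milvus:// URI, got {uri!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"URI must include both host and port: {uri!r}")
    return parts.hostname, parts.port


host, port = parse_milvus_uri("milvus://127.0.0.1:19530")
print(host, port)  # 127.0.0.1 19530
```

Checking the URI this way before pasting it into a notebook cell can catch a missing port or a mistyped scheme early.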
Create a collection named test01 with a dimension of 128, an index_file_size of 1024 (the data file size at which Milvus automatically creates an index), and Euclidean distance (L2) as the metric type:
create table test01 where dimension=128 and index_file_size=1024 and metric_type='L2'
drop table test01
You can split collections into partitions by partition tags for improved search performance. Each partition is also a collection.
create partition test01 where partition_tag='tag01'
Verify that the partition was created:
list partitions test01
drop partition test01 where partition_tag='tag01'
Note: In production, it is recommended to create indexes before inserting vectors into the collection. The index is then built automatically as vectors are imported. However, you need to create the same index again after the insertion process completes, because some data files may not reach index_file_size, and no index is built automatically for those files.
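To make the note concrete, here is a toy sketch of which data files would be indexed automatically and which would be left for the second create index call. The file sizes are made up, the unit (MB) and the "greater than or equal" comparison are assumptions, and `split_by_index_threshold` is a hypothetical name; this only models the rule stated above, not Milvus internals.

```python
def split_by_index_threshold(file_sizes, index_file_size=1024):
    """Partition data file sizes into (auto_indexed, left_unindexed).

    Assumption: a file is indexed automatically once its size reaches
    index_file_size; smaller files wait for an explicit create index.
    """
    auto_indexed = [size for size in file_sizes if size >= index_file_size]
    left_unindexed = [size for size in file_sizes if size < index_file_size]
    return auto_indexed, left_unindexed


# Hypothetical sizes: two full files and one partially filled file.
indexed, pending = split_by_index_threshold([1024, 1024, 300])
print(indexed)  # [1024, 1024]
print(pending)  # [300]
```

The partially filled file is why the index has to be created again once insertion has finished.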
Create an index for the collection. The following command sets index_type and nlist as an example:
create index test01 where index_type='FLAT' and nlist=4096
drop index test01
Insert a vector. If you do not specify vector IDs, Milvus automatically generates IDs for the vectors.
insert 2,3,5 from test01
Alternatively, you can provide user-defined vector IDs:
insert 2,3,5 from test01 where by id=0
insert 2,3,5 from test01 where partition_tag='tag01' by id=0
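Typing out a vector by hand gets tedious once it has 128 components. The sketch below builds the insert statements in the three forms shown above from a Python list, so the result can be pasted into a notebook cell. `build_insert` is a hypothetical helper, not part of milvus_kernel, and it deliberately refuses the "partition tag without ID" combination because that form does not appear in this README.

```python
import random


def build_insert(vector, collection, vector_id=None, partition_tag=None):
    """Format a kernel insert statement in one of the README's three forms.

    Hypothetical helper; it only mirrors the statement shapes shown above.
    """
    values = ",".join(str(v) for v in vector)
    statement = f"insert {values} from {collection}"
    if vector_id is None:
        if partition_tag is not None:
            # This combination is not shown in the README, so do not guess it.
            raise ValueError("partition_tag requires vector_id in this sketch")
        return statement
    statement += " where"
    if partition_tag is not None:
        statement += f" partition_tag='{partition_tag}'"
    return statement + f" by id={vector_id}"


print(build_insert([2, 3, 5], "test01"))
print(build_insert([2, 3, 5], "test01", vector_id=0))
print(build_insert([2, 3, 5], "test01", vector_id=0, partition_tag="tag01"))

# A random 128-component vector matching the dimension of test01.
vector_128 = [random.randint(0, 9) for _ in range(128)]
statement_128 = build_insert(vector_128, "test01", vector_id=1)
```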
To verify the vectors you have inserted, look them up by ID. Assume vectors with the following IDs exist:
select test01 by id=1,2,3
You can delete vectors by ID:
delete test01 by id=1
When performing operations related to data changes, you can flush the data from memory to disk to avoid possible data loss. Milvus also supports automatic flushing, which runs at a fixed interval to flush the data in all collections to disk. You can use the Milvus server configuration file to set the interval.
flush test01, test02
A segment is a data file that Milvus automatically creates by merging inserted vector data. A collection can contain multiple segments. If some vectors are deleted from a segment, the space taken by the deleted vectors cannot be released automatically. You can compact segments in a collection to release space.
compact test01
select 2, 3, 5 from test01 where top_k=1 and partition_tags='tag01' and nprobe=16
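The query above asks for the top_k nearest stored vectors to 2, 3, 5 under the collection's L2 metric. As a rough illustration of what that means, the following brute-force sketch ranks a small in-memory set of vectors by Euclidean distance. It is not how Milvus executes the search: it ignores indexes, nprobe, and partitions, the stored vectors are made up, and `l2_distance` and `top_k_search` are hypothetical names.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k_search(query, vectors, top_k=1):
    """Return the top_k (id, distance) pairs closest to query, nearest first.

    vectors maps a vector ID to its components.
    """
    scored = ((vid, l2_distance(query, vec)) for vid, vec in vectors.items())
    return heapq.nsmallest(top_k, scored, key=lambda pair: pair[1])


# Made-up stored vectors keyed by ID.
stored = {0: [2, 3, 5], 1: [2, 3, 6], 2: [9, 9, 9]}
print(top_k_search([2, 3, 5], stored, top_k=1))  # [(0, 0.0)]
print(top_k_search([2, 3, 5], stored, top_k=2))  # [(0, 0.0), (1, 1.0)]
```

A real index trades some of this exactness for speed, which is where a parameter such as nprobe comes in.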