# ⏰ Install & Import Dependencies

In [2]:
!pip install docarray

Collecting docarray
  Downloading docarray-0.13.10.tar.gz (627 kB)

In [5]:
# Importing necessary dependencies
from docarray import Document, DocumentArray

# 🪡 Data Pre-processing

In [6]:
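# `doc` is assumed to hold the full novel as a single Document, e.g. (assumed source URL):
# doc = Document(uri='https://www.gutenberg.org/files/1342/1342-0.txt').load_uri_to_text()
# break the large text into smaller chunks: one Document per non-empty line
docs = DocumentArray(Document(text=s.strip()) for s in doc.text.split('\n') if s.strip())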

# 🏗 Generate Vector Embeddings 

We use **feature hashing** to generate the vector embeddings because it is fast and space-efficient. It works by applying a hash function to each feature (here, each token of the text) and using the resulting hash value as an index into a fixed-length vector.
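
To make the idea concrete, here is a minimal sketch of the hashing trick in plain Python. It is not docarray's implementation: the function name `feature_hash`, the 16-dimensional vector, the regex tokenizer, and the choice of MD5 as the hash are all assumptions made for illustration.

```python
import hashlib
import re
from typing import List


def feature_hash(text: str, n_dim: int = 16) -> List[int]:
    """Map a text to a fixed-length count vector (illustrative sketch only)."""
    vec = [0] * n_dim
    for token in re.findall(r"\w+", text.lower()):
        # Use a stable hash so the same token always lands in the same slot,
        # regardless of the Python process (unlike the built-in hash()).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_dim
        # Different tokens may collide on the same index; that is the
        # trade-off the hashing trick makes for a fixed, small vector size.
        vec[index] += 1
    return vec


vec = feature_hash("she entered the room")
print(vec)
```

The vector length never depends on the vocabulary size, which is why no vocabulary has to be built or stored before embedding.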

In [7]:
# apply feature hashing to embed the DocumentArray
docs.apply(lambda doc: doc.embed_feature_hashing())

# 🪄 Querying the Data 

Let's take the query sentence "**she entered the room**" from Pride and Prejudice and see what response we get.

In [10]:
# query sentence
query = (
    Document(text="she entered the room")
    .embed_feature_hashing()
    .match(docs, limit=3, exclude_self=True, metric="jaccard", use_scipy=True)
)

In [28]:
# fetch the matched texts (index 0); the Jaccard scores are at index 1
output = query.matches[:, ('text', 'scores__jaccard')][0]

In [29]:
# print the results
for text in output:
    print(text)

staircase, than she entered the breakfast-room, and congratulated
of the room.
She entered the room with an air more than usually ungracious,
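
The query above uses `metric="jaccard"`, which ranks candidates by Jaccard distance (smaller means more similar). As a rough illustration of that metric, the sketch below computes it on sets of tokens rather than on the hashed vectors the notebook actually compares, so the numbers will not match docarray's scores. The helper names `tokens` and `jaccard_distance` are hypothetical, and the third candidate line is included only as a non-matching example.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical sets, 1.0 means disjoint."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


query_tokens = tokens("she entered the room")
candidates = [
    "She entered the room with an air more than usually ungracious,",
    "of the room.",
    "It is a truth universally acknowledged,",
]

# Sort candidates from most to least similar to the query.
ranked = sorted(candidates, key=lambda s: jaccard_distance(query_tokens, tokens(s)))
for line in ranked:
    print(round(jaccard_distance(query_tokens, tokens(line)), 3), line)
```

Because the distance is normalised by the size of the union, a short line that shares a couple of words with the query can rank as high as a longer line that contains the whole query.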


# Next Steps

### Building into a real-world application

In a future notebook, we'll use **[Jina's neural search framework](https://github.com/jina-ai/jina/)** and **[Jina Hub Executors](https://hub.jina.ai)** to build a [real-world fashion search engine](http://examples.jina.ai/fashion) in only a few lines of code.

![Demo of the fashion search engine](https://github.com/alexcg1/jina-multimodal-fashion-search/raw/main/demo.gif)