# __Search Image by Keyword__

- Tutorial Difficulty: ★☆☆☆☆
- 10 min read
- Languages: [SQL](https://en.wikipedia.org/wiki/SQL) (100%)
- File location: tutorial_en/thanosql_search/search_image_by_keyword.ipynb
- References: [Food Image and Nutrition Text Introduction Dataset](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=74)

## Tutorial Introduction

<div class="admonition note">
    <h4 class="admonition-title">Understanding Keyword-Image Search</h4>
    <p>ThanoSQL supports searching images by keyword. An image classification model is trained with the keyword as its target value, and the trained model's predictions are then stored as an index column for the images. In other words, keyword-image search finds images that belong to the desired target value (category).</p>
</div>

The dictionary defines "search" as "finding the necessary materials in a book or computer according to its purpose." The ThanoSQL keyword-image search does not look for information that contains a specific word (keyword). Instead, it creates a model that predicts words from the features of an image and returns the images most relevant to the keyword.

__The following are examples and applications of the ThanoSQL keyword-image search algorithm.__

- Use shopping categories to create a model, and use it to create an index column. Combining the index column with attributes such as image registration dates makes searches more accurate.
- You can create your own image search service by utilizing image-image search and text-image search from their respective tutorials.
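
The first application above can be sketched in plain Python. This is only an illustration under stated assumptions: the built-in `sqlite3` module stands in for the ThanoSQL workspace database, and the `catalog` table, its rows, the `registered_at` column, and the `search` helper are all hypothetical. It shows how a predicted category stored in a column can be combined with a registration date in one filter.

```python
import sqlite3

# Hypothetical catalog: the predicted category is stored as a column
# next to a registration date. All rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog (image_path TEXT, predict_result TEXT, registered_at TEXT)"
)
conn.executemany(
    "INSERT INTO catalog VALUES (?, ?, ?)",
    [
        ("img/001.jpg", "shoes", "2023-01-05"),
        ("img/002.jpg", "shoes", "2023-03-20"),
        ("img/003.jpg", "bags", "2023-03-21"),
    ],
)


def search(category, registered_after):
    """Return image paths in `category` registered on or after the given date."""
    rows = conn.execute(
        "SELECT image_path FROM catalog "
        "WHERE predict_result = ? AND registered_at >= ? "
        "ORDER BY registered_at",
        (category, registered_after),
    ).fetchall()
    return [row[0] for row in rows]


# ISO-formatted dates compare correctly as plain strings.
recent_shoes = search("shoes", "2023-03-01")
print(recent_shoes)
```

Because the category is an ordinary column, any other attribute in the table can be added to the `WHERE` clause in the same way.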

<div class="admonition note">
    <h4 class="admonition-title"> In This Tutorial</h4>
    <p>👉 Use a combination of the "<strong>PREDICT USING</strong>" query and the "<strong>SELECT</strong>" query to search images using specific keywords. </p>
</div>

<div class="admonition tip">
    <h4 class="admonition-title">Dataset Description</h4>
    <p>The Food Image and Nutrition Text Introduction Dataset was organized by the Ministry of Science and ICT and is supported by the Korea Intelligence Information Society Agency. It consists of 400 food items and 842,000 images. This tutorial uses only a small subset of that dataset (10 food types, 1,190 images).</p>
</div>

## __0. Prepare Dataset__

As described in the [ThanoSQL Workspace](https://docs.thanosql.ai/en/getting_started/paas/workspace/lab/) guide, you must issue an API token and run the cell below before you can execute ThanoSQL queries.

In [None]:
%load_ext thanosql
%thanosql API_TOKEN=<Issued_API_TOKEN>

### __Prepare Dataset__

In [2]:
%%thanosql
GET THANOSQL DATASET diet_data
OPTIONS (overwrite=True)

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>GET THANOSQL DATASET</strong>" downloads the specified dataset to the workspace.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the <strong>GET THANOSQL DATASET</strong> clause.
        <ul>
            <li>"overwrite": determines whether to overwrite a dataset if it already exists. If set to True, the old dataset is replaced with the new dataset (bool, optional, True|False, default: False)</li>
        </ul>
        </li>
    </ul>
</div>

In [3]:
%%thanosql
COPY diet 
OPTIONS (if_exists='replace')
FROM 'thanosql-dataset/diet_data/diet.csv'

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>COPY</strong>" specifies the name of the dataset to be saved as a database table.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the <strong>COPY</strong> clause.
        <ul>
           <li>"if_exists": determines what happens when the table already exists. The query can raise an error, append to the existing table, or replace the existing table (str, optional, 'fail'|'replace'|'append', default: 'fail')</li>
        </ul>
        </li>
    </ul>
</div>
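
The three `if_exists` modes can be illustrated with a small stand-in. The sketch below is not ThanoSQL's implementation: it uses Python's built-in `sqlite3` module, and `copy_rows` is a hypothetical helper that only mimics the behaviour described above ('fail' raises an error, 'append' adds rows, 'replace' recreates the table).

```python
import sqlite3


def copy_rows(conn, table, rows, if_exists="fail"):
    """Load (image_path, label) rows into `table` under one of three modes."""
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"unknown if_exists value: {if_exists!r}")
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone() is not None
    if exists and if_exists == "fail":
        raise ValueError(f"table {table!r} already exists")
    if exists and if_exists == "replace":
        # Drop the old table so it is rebuilt from scratch below.
        conn.execute(f'DROP TABLE "{table}"')
        exists = False
    if not exists:
        conn.execute(f'CREATE TABLE "{table}" (image_path TEXT, label TEXT)')
    conn.executemany(f'INSERT INTO "{table}" VALUES (?, ?)', rows)


conn = sqlite3.connect(":memory:")
rows = [("a.jpg", "사과파이"), ("b.jpg", "백향과")]  # made-up sample rows

copy_rows(conn, "diet", rows)                       # table is created: 2 rows
copy_rows(conn, "diet", rows, if_exists="append")   # rows are added: 4 rows
copy_rows(conn, "diet", rows, if_exists="replace")  # table is rebuilt: 2 rows

count = conn.execute("SELECT COUNT(*) FROM diet").fetchone()[0]
print(count)
```

Using 'replace', as this tutorial does, makes the cell safe to re-run, at the cost of discarding whatever the table held before.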

## __1. Check Dataset__

To create a keyword-image search model, we use the __diet__ table located in the ThanoSQL workspace database. Run the query below to check the contents of the table.

In [4]:
%%thanosql
SELECT * 
FROM diet
LIMIT 5

Unnamed: 0,image_path,label
0,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과
1,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과
2,thanosql-dataset/diet_data/diet/백향과/1_A220148X...,백향과
3,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과
4,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과


<div class="admonition note">
    <h4 class="admonition-title">Understanding the Data Table</h4>
    <p>The <strong>diet</strong> table contains the following information. </p>
    <ul>
        <li>image_path: image path </li>
        <li>label: image label</li>
    </ul>
</div>
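
In the sample rows above, the folder name inside `image_path` matches the `label` value. Assuming that holds for the whole table, a quick consistency check can be sketched in Python as follows. The file names (`example_0.jpg` and so on) and the `folder_label` helper are hypothetical, since the real file names are truncated in the output, and the third row is mislabeled on purpose.

```python
from pathlib import PurePosixPath

# Made-up rows in the same (image_path, label) shape as the diet table.
rows = [
    ("thanosql-dataset/diet_data/diet/백향과/example_0.jpg", "백향과"),
    ("thanosql-dataset/diet_data/diet/사과파이/example_1.jpg", "사과파이"),
    ("thanosql-dataset/diet_data/diet/사과파이/example_2.jpg", "백향과"),  # wrong on purpose
]


def folder_label(image_path):
    """Return the name of the folder that directly contains the image."""
    return PurePosixPath(image_path).parent.name


# Collect every path whose folder name disagrees with its label.
mismatches = [path for path, label in rows if folder_label(path) != label]
print(mismatches)
```

An empty `mismatches` list would suggest the labels and the folder layout agree; any entries point at rows worth inspecting before training.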

## __2. Build an Image Search Model__

To create an image search model with the name __diet_image_classification__ using the __diet__ table, run the following query.  
(Estimated duration of query execution: 3 min)

In [5]:
%%thanosql
BUILD MODEL diet_image_classification
USING ConvNeXt_Tiny
OPTIONS (
    image_col='image_path', 
    label_col='label', 
    max_epochs=1,
    overwrite=True
    )
AS 
SELECT *
FROM diet

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>BUILD MODEL</strong>" creates and trains a model named <strong>diet_image_classification</strong>.</li>
        <li>"<strong>USING</strong>" specifies <strong>ConvNeXt_Tiny</strong> as the base model.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values used to create a model.
        <ul>
            <li>"image_col": the name of the column containing the image path (str, default: 'image_path')</li>
            <li>"label_col": the name of the column containing information about the target (str, default: 'label')</li>
            <li>"max_epochs": number of passes over the training dataset during training (int, optional, default: 3)</li>
            <li>"overwrite": determines whether to overwrite a model if it already exists. If set to True, the old model is replaced with the new model (bool, optional, True|False, default: False) </li>
        </ul>
        </li>
    </ul>
</div>

## __3. Predict__

To predict labels for the images in the __diet__ table with the __diet_image_classification__ model built in the previous step, run the following query.

In [6]:
%%thanosql
PREDICT USING diet_image_classification
OPTIONS (
    image_col='image_path',
    result_col='predict_result'
    )
AS 
SELECT *
FROM diet

Unnamed: 0,image_path,label,predict_result
0,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과,백향과
1,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과,백향과
2,thanosql-dataset/diet_data/diet/백향과/1_A220148X...,백향과,백향과
3,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과,백향과
4,thanosql-dataset/diet_data/diet/백향과/0_A220148X...,백향과,백향과
...,...,...,...
1185,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
1186,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
1187,thanosql-dataset/diet_data/diet/사과파이/1_A020511...,사과파이,사과파이
1188,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>PREDICT USING</strong>" predicts the outcome using the <strong>diet_image_classification</strong> model.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values used to predict with the model.
        <ul>
            <li>"image_col": the name of the column containing the image path (str, default: 'image_path')</li>
            <li>"result_col": the name of the column that will contain the predicted results (str, optional, default: 'predict_result')</li>
        </ul>
        </li>
    </ul>
</div>
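
Once `label` and `predict_result` sit side by side, accuracy is simply the share of rows where the two agree. The sketch below assumes the prediction output has been reduced to `(label, predict_result)` pairs; the pairs shown are made up, and `accuracy_by_label` is a hypothetical helper rather than a ThanoSQL function.

```python
from collections import defaultdict


def accuracy_by_label(pairs):
    """Return {label: fraction of rows with that label predicted correctly}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for label, predicted in pairs:
        totals[label] += 1
        if label == predicted:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Made-up (label, predict_result) pairs for illustration only.
pairs = [
    ("사과파이", "사과파이"),
    ("사과파이", "백향과"),
    ("백향과", "백향과"),
    ("백향과", "백향과"),
]

per_label = accuracy_by_label(pairs)
overall = sum(label == predicted for label, predicted in pairs) / len(pairs)
print(per_label, overall)
```

Note that this tutorial predicts on the same __diet__ table that was used for training, so a score computed this way is likely to be optimistic compared with one measured on unseen images.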

## __4. Search__

To retrieve data that meets specific conditions, combine the "__PREDICT USING__", "__SELECT__", and "__WHERE__" clauses. The following query returns rows where the label is '사과파이' ('apple pie') and the prediction result is also '사과파이'.

In [7]:
%%thanosql
SELECT *
FROM (
    PREDICT USING diet_image_classification
    AS
    SELECT *
    FROM diet
    )
WHERE label=predict_result
AND label LIKE '사과파이'
LIMIT 10

Unnamed: 0,image_path,label,predict_result
0,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
1,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
2,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
3,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
4,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
5,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
6,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
7,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
8,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이
9,thanosql-dataset/diet_data/diet/사과파이/0_A020511...,사과파이,사과파이


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>SELECT * FROM (...)</strong>" selects all the results of the nested "<strong>PREDICT USING</strong>" query.</li>
        <li>"<strong>WHERE</strong>" sets the selection condition. "<strong>AND</strong>" allows multiple conditions.
        <ul>
            <li>"label=predict_result": queries only data where the label column and predict_result column are equal</li>
            <li>"label LIKE '사과파이'": queries data where the label value is '사과파이' ('apple pie')</li>
        </ul>
        </li>
    </ul>
</div>
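
The two `WHERE` conditions can be tried out on a toy table. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the ThanoSQL database, so `LIKE` details may differ between engines. The `predictions` table, its rows, the label '사과주스', and the `search` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions (image_path TEXT, label TEXT, predict_result TEXT)"
)
conn.executemany(
    "INSERT INTO predictions VALUES (?, ?, ?)",
    [
        ("a.jpg", "사과파이", "사과파이"),
        ("b.jpg", "사과파이", "백향과"),    # misclassified, so it is filtered out
        ("c.jpg", "백향과", "백향과"),
        ("d.jpg", "사과주스", "사과주스"),  # hypothetical extra label
    ],
)


def search(pattern):
    """Return correctly predicted images whose label matches a LIKE pattern."""
    rows = conn.execute(
        "SELECT image_path FROM predictions "
        "WHERE label = predict_result AND label LIKE ? "
        "ORDER BY image_path",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]


exact = search("사과파이")  # no wildcard: only the exact label matches
prefix = search("사과%")     # % matches any run of characters after the prefix
print(exact, prefix)
```

Without a wildcard, the `LIKE` condition here selects the same rows as `label = '사과파이'`; adding `%` widens the keyword to every label that starts with the given prefix.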

## __5. In Conclusion__

In this tutorial, we built an image search model that finds food images in the food image dataset by keyword. As this is a beginner-level tutorial, we focused on the process rather than accuracy. The model's accuracy can be improved by adjusting options such as the number of epochs or by using a larger dataset. To create your own search services, continue with the image-image and text-image search tutorials.

* [How to Upload My Data to the ThanoSQL Workspace](https://docs.thanosql.ai/en/getting_started/data_upload/)
* [How to Create a Table Using My Data](https://docs.thanosql.ai/en/how-to_guides/ThanoSQL_query/COPY_SYNTAX/)
* [How to Upload My Model to the ThanoSQL Workspace](https://docs.thanosql.ai/en/how-to_guides/ThanoSQL_query/UPLOAD_MODEL_SYNTAX/)

<div class="admonition tip">
    <h4 class="admonition-title">Inquiries About Deploying a Model for Your Own Service</h4>
    <p>If you have any difficulty creating your own model with ThanoSQL or applying it to your services, please feel free to contact us below 😊</p>
    <p>For inquiries regarding building keyword-image search models: <a href="mailto:contact@smartmind.team">contact@smartmind.team</a></p>
</div>