<a href="https://www.kaggle.com/code/patimejia/fastai-02-production-test-writing-0-00-1?scriptVersionId=118156167" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
!python -m pip list | grep fast

fastai                                2.7.10
fastapi                               0.89.1
fastavro                              1.5.2
fastcore                              1.5.27
fastdownload                          0.0.7
fasteners                             0.17.3
fastjsonschema                        2.15.3
fastprogress                          1.0.3
fasttext                              0.9.2
pyfasttext                            0.4.6


Check whether a DuckDuckGo package is already installed:

In [2]:
!python -m pip list | grep 'du\|dd'

giddy                                 2.3.3
google-cloud-scheduler                2.6.4
mdurl                                 0.1.0
pyasn1-modules                        0.2.7
pydub                                 0.25.1
sklearn-contrib-py-earth              0.1.0+1.gdde5f89
tensorflow-addons                     0.14.0


The pattern matches several packages with `du` or `dd` in their names, but none of them is a DuckDuckGo package, so `duckduckgo_search` is not installed yet.
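The same check can be made from Python instead of piping `pip list` through `grep`. The following is a minimal sketch, assuming Python 3.8+ (where `importlib.metadata` is part of the standard library); the helper name `find_distributions` is hypothetical and not part of any library.

```python
from importlib import metadata


def find_distributions(substring):
    """Return sorted (name, version) pairs for installed distributions
    whose name contains `substring` (case-insensitive)."""
    needle = substring.lower()
    matches = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.metadata.get("Version") or ""
        except Exception:
            # Unreadable metadata: skip the entry rather than fail the whole scan
            continue
        if name and needle in name.lower():
            matches.add((name, version))
    return sorted(matches)


# An empty list means no installed distribution name contains the substring
print(find_distributions("duckduckgo"))
```

Unlike the `grep 'du\|dd'` pattern, a search for the full word `duckduckgo` avoids unrelated matches such as `pydub` or `giddy`.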

Install `duckduckgo_search` with `pip`:

In [3]:
!pip install -Uqq duckduckgo_search


- `pip` is a recursive acronym that can stand for either "Pip Installs Packages" or "Pip Installs Python".
- `pip` is a command-line tool for installing and managing Python packages (distributions of modules or libraries), by default from PyPI. It does not install standalone Java, C, C++, or Fortran libraries, although a Python package may bundle compiled extensions written in C, C++, or Fortran.
- `install` is a subcommand of `pip` that installs packages.
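- `-U` (`--upgrade`) is an option of `pip install` that upgrades the named package to the newest available version if it is already installed.
- `-qq` repeats the `-q` (`--quiet`) option of `pip install` to suppress most of the command's output; with two `q`s only errors are still shown.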
- `duckduckgo_search` is the name of the package to install. It is a Python wrapper for the DuckDuckGo search engine and is available on PyPI. 
- `duckduckgo_search` is not a dependency of `fastai`, so installing `fastai` does not pull it in; this notebook needs it only for the image search and installs it separately.
- [documentation](https://pypi.org/project/duckduckgo-search/#3-ddg_images---image-search-by-duckduckgocom) for `duckduckgo_search`
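The options described in this list can also be assembled from Python, which is useful when a script rather than a notebook cell performs the install. This is a minimal sketch; `pip_install_command` is a hypothetical helper that only builds the argument list and does not run anything.

```python
import sys


def pip_install_command(package, upgrade=True, quiet=2):
    """Build the argument list for `python -m pip install` with the given options."""
    # sys.executable makes pip run under the same interpreter as this script
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("-U")
    if quiet > 0:
        # pip accepts -q up to three times
        cmd.append("-" + "q" * min(quiet, 3))
    cmd.append(package)
    return cmd


cmd = pip_install_command("duckduckgo_search")
print(cmd[1:])  # the arguments after the interpreter path
# To actually run the install: import subprocess; subprocess.run(cmd, check=True)
```

Calling pip as `python -m pip` through `sys.executable`, rather than as a bare `pip`, avoids installing into a different Python environment than the one running the code.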


In [4]:
!python -m pip list | grep 'du\|dd'

duckduckgo-search                     2.8.0
giddy                                 2.3.3
google-cloud-scheduler                2.6.4
mdurl                                 0.1.0
pyasn1-modules                        0.2.7
pydub                                 0.25.1
sklearn-contrib-py-earth              0.1.0+1.gdde5f89
tensorflow-addons                     0.14.0


The output shows that `duckduckgo-search` version 2.8.0 has been installed.

# Test installations

In [5]:
def test_imports():
    try:
        import fastai
        import duckduckgo_search
    except ImportError as e:
        print(f'Import failed: {e}')
        return

    print(f'fastai version: {fastai.__version__}')
#     print(f'fastcore version: {fastcore.__version__}')
    print(f'duckduckgo_search version: {duckduckgo_search.__version__}')
#     print(f'fastdownload version: {fastdownload.__version__}')
#     print(f'PIL version: {PIL.__version__}')
    print('Success! All calls to imports were successful.')


test_imports()

fastai version: 2.7.10
duckduckgo_search version: 2.8.0
Success! All calls to imports were successful.
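A variant that returns a result instead of printing makes the check usable in assertions. This is a sketch rather than the notebook's own method; `check_imports` is a hypothetical helper, and the module names passed to it here are only examples chosen so the snippet runs without extra packages.

```python
import importlib


def check_imports(module_names):
    """Map each module name to True if it imports cleanly, else False."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError
            status[name] = False
    return status


status = check_imports(["json", "module_that_does_not_exist_xyz"])
print(status)
```

In this notebook, `assert all(check_imports(["fastai", "duckduckgo_search"]).values())` would stop execution if either package were missing, instead of printing a message and continuing.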


# Create the search image function

In [6]:
from duckduckgo_search import ddg_images
from fastai.vision.all import *


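def search_image_urls(term:str, max_images:int, min_sz:int)->L:
    "Return the unique, sorted URLs of images wider and taller than `min_sz` pixels."
    print(f"Searching for {term} images ...")
    results = ddg_images(term, max_results=max_images)
    # Treat a missing width or height as 0 so the comparison cannot fail on None
    images = [result.get('image') for result in results
              if (result.get('width') or 0)>min_sz and (result.get('height') or 0)>min_sz]
    return L(images).unique().sorted()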

The code imports the `ddg_images` function from the `duckduckgo_search` library and, through a star import, the `L` class and everything else exported by the `fastai.vision.all` module.

A function `search_image_urls` is defined that takes in three arguments:

- `term`: a string representing the search term.
- `max_images`: the maximum number of images to be returned.
- `min_sz`: the minimum size of images to be returned.

The code prints a message indicating that it is searching for images using the search term.

The `ddg_images` function is called with the search term and the maximum number of images to be returned, and the result is stored in the `results` variable.

A list comprehension is used to filter the `results` to only include images that have a width and height greater than the `min_sz` value. The filtered images are stored in the `images` variable.

The code wraps the list in an `L`, the list-like class from `fastcore.foundation` that fastai re-exports, and applies the `unique()` and `sorted()` methods to it to remove duplicate URLs and sort the list, respectively.

The result is returned as the output of the function.
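The filter, de-duplicate, and sort steps can be tried without a network call. The sketch below assumes each search result is a dict with `image`, `width`, and `height` keys, as the function above expects. It swaps in plain Python (`set` and `sorted`) for fastcore's `L(...).unique().sorted()`, and both `filter_image_urls` and the sample data are hypothetical.

```python
def filter_image_urls(results, min_sz):
    """Return sorted, unique image URLs whose width and height both exceed min_sz."""
    urls = set()  # a set drops duplicate URLs
    for result in results:
        url = result.get("image")
        # Treat a missing dimension as 0 so the result is filtered out
        width = result.get("width") or 0
        height = result.get("height") or 0
        if url and width > min_sz and height > min_sz:
            urls.add(url)
    return sorted(urls)


# Made-up results standing in for what an image search might return
fake_results = [
    {"image": "http://example.com/b.jpg", "width": 640, "height": 480},
    {"image": "http://example.com/a.jpg", "width": 300, "height": 300},
    {"image": "http://example.com/b.jpg", "width": 640, "height": 480},  # duplicate
    {"image": "http://example.com/tiny.jpg", "width": 64, "height": 64},  # too small
    {"image": "http://example.com/nosize.jpg"},  # no dimensions reported
]

print(filter_image_urls(fake_results, min_sz=128))
# → ['http://example.com/a.jpg', 'http://example.com/b.jpg']
```

Keeping the filtering separate from the search call makes it possible to test the logic on fixed data, independent of what the search engine returns on a given day.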

In [7]:
term = 'parsley seeds'
min_sz=128
max_images=150 
number_of_results_to_print = 2

url_list = search_image_urls(term, max_images, min_sz) 

Searching for parsley seeds images ...


In [8]:
def test_search_image_urls(number_of_results_to_print): 
    try:
        url_list = search_image_urls(term, max_images, min_sz)
    except Exception as e:
        print(f'Test failed: {e}')
        return

    print(f'Number of images found: {len(url_list)}')
    print(f'Number of duplicates: {len(url_list)-len(url_list.unique())}')
    print(f'Number of images dropped due to size: {max_images-len(url_list)}')
    print(f'Number of images kept: {len(url_list)}')
    print(f'Number of images to print: {number_of_results_to_print}')
    print(f'Number of images printed: {len(url_list[:number_of_results_to_print])}')
    print('Success! The search_image_urls function works as expected.')
    print(f'Here are some sample urls: {url_list[:number_of_results_to_print]}')

# Test the search image function

In [9]:
test_search_image_urls(3)

Searching for parsley seeds images ...
Number of images found: 150
Number of duplicates: 0
Number of images dropped due to size: 0
Number of images kept: 150
Number of images to print: 3
Number of images printed: 3
Success! The search_image_urls function works as expected.
Here are some sample urls: ['http://1.bp.blogspot.com/-T6OZMO6xVQU/U8SkIpbh7OI/AAAAAAAADvQ/AZWwjeiR1ec/s1600/Parsley+seeds+harvested.jpg', 'http://2.bp.blogspot.com/-ear3EqvPM00/UtcWHt5dCAI/AAAAAAAAWT0/JDV4cDxQ8g8/s1600/Parsley2.jpg', 'http://2.bp.blogspot.com/_ElwJb9bX2wc/TIU0o-sbsnI/AAAAAAAAAjQ/btV3ZvHZR8c/s1600/DSC00602.JPG']
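The report above prints "Success!" whenever the search call does not raise, and `max_images - len(url_list)` counts results missing for any reason (size filter, duplicates, or fewer hits), not only size. A stricter check could assert the properties the function is meant to guarantee. This is a minimal sketch, assuming those properties are "at most `max_images` unique, sorted HTTP(S) URLs"; `check_url_list` is a hypothetical helper, and the sample URLs are made up.

```python
def check_url_list(url_list, max_images):
    """Raise AssertionError if the list breaks a property the search should guarantee."""
    urls = list(url_list)
    assert len(urls) <= max_images, "more URLs than requested"
    assert len(set(urls)) == len(urls), "duplicate URLs present"
    assert urls == sorted(urls), "URLs are not sorted"
    assert all(u.startswith(("http://", "https://")) for u in urls), "non-HTTP URL found"
    return True


sample = ["http://example.com/a.jpg", "http://example.com/b.jpg"]
print(check_url_list(sample, max_images=150))
```

Passing the notebook's `url_list` and `max_images` to the helper would turn the printed report into a check that fails as soon as one of these properties is violated.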
