## License Plate Detection using Fuzzy Join 

<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/09-license-plate-fuzzy-join.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" /> Run on Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/blob/master/tutorials/09-license-plate-fuzzy-join.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" /> View source on GitHub</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/georgia-tech-db/eva/raw/master/tutorials/09-license-plate-fuzzy-join.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" /> Download notebook</a>
  </td>
</table><br><br>

### Connect to EvaDB

In [1]:
%pip install --quiet "evadb[vision,notebook]"
import evadb
cursor = evadb.connect().cursor()

### Load the images into EvaDB for analysis

In [2]:
# Download images
!wget -nc "https://www.dropbox.com/s/770stddqfl0psog/license.zip"
!unzip -n license.zip

cursor.query('DROP TABLE IF EXISTS MyImages;').df()

cursor.load("license/Car*.png", "MyImages", format="image").df()

File ‘license.zip’ already there; not retrieving.

Archive:  license.zip


Unnamed: 0,0
0,Number of loaded IMAGE: 7


### License Plate Recognition

In [3]:
cursor.query("""DROP UDF IF EXISTS OCRExtractor;""").df()

cursor.query("""DROP UDF IF EXISTS FuzzDistance;""").df()

cursor.create_udf("OCRExtractor", True, '../evadb/udfs/ocr_extractor_HuggingFace.py').df()
cursor.create_udf("FuzzDistance", True, '../evadb/udfs/fuzzy_join.py').df()

Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


Unnamed: 0,0
0,UDF FuzzDistance successfully added to the dat...


In [4]:
cursor.query(
    "CREATE TABLE IF NOT EXISTS LicensePlateCSV(id INTEGER UNIQUE, label TEXT(30));"
).df()
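
The next cell loads `data.csv`, which is not created anywhere in this notebook. As a minimal sketch, the file could be generated as follows. The `id,label` header row is an assumption (chosen to match the `LicensePlateCSV` columns), the helper name `write_plate_csv` is hypothetical, and the rows are copied from the table contents printed in cell [7].

```python
import csv

# Plate labels as they appear in the LicensePlateCSV table printed in cell [7].
PLATES = [
    (1, "KLG1CA2555"),
    (2, "PGMN112"),
    (3, "PRENUP"),
    (4, "DZ17YXR"),
    (5, "PUI8BES"),
]


def write_plate_csv(path="data.csv"):
    """Write the plate rows to `path` with an assumed `id,label` header."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])  # assumed header, matching the table columns
        writer.writerows(PLATES)


write_plate_csv()
```

If the loader in your EvaDB version expects a different header or column order, adjust the header row accordingly.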



In [5]:
cursor.load("data.csv", "LicensePlateCSV", format="csv").df()

Unnamed: 0,CSV,Number of loaded frames
0,data.csv,5


In [6]:
query = cursor.table("MyImages")
query = query.select("OCRExtractor(data)")
query.df()

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

2023-06-13 21:03:31,716	INFO worker.py:1625 -- Started a local Ray instance.




[2m[36m(ray_parallel pid=48380)[0m Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


Unnamed: 0,ocrextractor.ocr_data
0,DZI7 YXR
1,PREANUP
2,KLC01CA2555
3,PG MN112
4,alanystockphoto
5,PU18 BES
6,VIRGINIA 80211N人口密度 67 મોત
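
The `huggingface/tokenizers` warning in the output above recommends setting `TOKENIZERS_PARALLELISM` explicitly. One way to do that, sketched here under the assumption that it runs before the cell that registers `OCRExtractor` (so the variable is in place before the tokenizer is first used), is:

```python
import os

# Disable tokenizer parallelism up front, as the warning message suggests.
# Assumption: this cell runs before any tokenizer is loaded or any process is forked.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
```

This only silences the warning and avoids the potential deadlock it describes; it should not change the OCR results.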


In [7]:
query = cursor.table("LicensePlateCSV")
query = query.select("*")

query.df()

Unnamed: 0,licenseplatecsv._row_id,licenseplatecsv.id,licenseplatecsv.label
0,1,1,KLG1CA2555
1,2,2,PGMN112
2,3,3,PRENUP
3,4,4,DZ17YXR
4,5,5,PUI8BES
5,6,1,KLG1CA2555
6,7,2,PGMN112
7,8,3,PRENUP
8,9,4,DZ17YXR
9,10,5,PUI8BES


### Run a Fuzzy Join to match detected license plates against a local license plate database (CSV)

In [8]:
cursor.query("""
   SELECT * FROM MyImages 
       JOIN LATERAL OCRExtractor(data) AS T(a) 
       JOIN LicensePlateCSV B 
       ON FuzzDistance(T.a, B.label) > 50;
       """).df()

[2m[36m(ray_parallel pid=48381)[0m Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


Unnamed: 0,myimages._row_id,myimages.name,myimages.data,B._row_id,B.id,B.label,T.a
0,3,license/Cars0.png,"[[[25, 75, 100], [73, 130, 159], [52, 127, 158...",1,1,KLG1CA2555,KLC01CA2555
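
To build intuition for the `FuzzDistance(T.a, B.label) > 50` predicate, the sketch below scores one OCR string against the plate labels on a 0 to 100 scale and keeps the labels above the threshold. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the actual `FuzzDistance` UDF in `fuzzy_join.py` may compute its score differently, and the helper name `similarity_score` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity_score(a: str, b: str) -> float:
    """Return a 0-100 similarity score (a stand-in for FuzzDistance, not the UDF itself)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


THRESHOLD = 50  # same cutoff as the `> 50` predicate in the query

ocr_text = "KLC01CA2555"  # OCR output for license/Cars0.png
labels = ["KLG1CA2555", "PGMN112", "PRENUP", "DZ17YXR", "PUI8BES"]

# Score every label, then keep only those above the threshold.
scored = [(label, similarity_score(ocr_text, label)) for label in labels]
matches = [(label, score) for label, score in scored if score > THRESHOLD]

for label, score in matches:
    print(f"{ocr_text} ~ {label}: {score:.1f}")
```

A threshold-based predicate like this tolerates small OCR errors (an extra or substituted character) while rejecting unrelated strings; raising the threshold makes the join stricter.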
