<a href="https://colab.research.google.com/github/naveedkhalid091/Learn_Agentic_AI/blob/main/step02_generative_ai_for_beginners/02(c)_Advance_RAG_Picture%26audio_recognition_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Face & Audio Detection with Embeddings:**

Face-recognition embeddings make it possible to build a **security system** that identifies not only humans but also animals.

The same technology could support an animal "passport", because many animals carry unique identifying features in their nose patterns.

- You can install **`facenet`** (the `facenet-pytorch` package) for face detection and face embeddings.
- You can also install **`YAMNet`** for voice/sound recognition.

Combining these two technologies, you can ask an LLM to produce **written notes** from a single video such as a recorded Zoom lecture: **`facenet`** recognises the speaker's face in the video, `YAMNet` recognises that person's voice, and the LLM writes notes based on what was said in the class.





In [4]:
!pip install -U -q facenet-pytorch

In [6]:
!pip install -U -q pillow

In [7]:
import torch  # installed by default in Colab

import torch.nn as nn  # nn = neural network building blocks
import torchvision.transforms as transforms
from PIL import Image


In [None]:
from facenet_pytorch import MTCNN, InceptionResnetV1  # MTCNN detects faces; InceptionResnetV1 is the embedding network

model = InceptionResnetV1(pretrained='vggface2').eval()  # weights pretrained on the VGGFace2 dataset
model  # displays the complete architecture of the model

**Note: If we now pass any human or animal picture to the model above, it returns an embedding that captures the important features of that face.**

In [9]:
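# Preprocessing function that turns an image file into a batched tensor
def preprocess_image(image_path):
  image = Image.open(image_path).convert('RGB')
  # Note: these are ImageNet-style settings. facenet-pytorch's own MTCNN pipeline
  # produces 160x160 face crops by default, so treat the values below as this notebook's choice.
  preprocess = transforms.Compose([
      transforms.Resize((224, 224)),
      transforms.ToTensor(),
      transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
  ])
  return preprocess(image).unsqueeze(0)  # add a batch dimension: (1, 3, 224, 224)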

In [17]:
# Function to create an image embedding

def create_image_embeddings(image_path):
  try:
    input_tensor = preprocess_image(image_path)
    with torch.no_grad():  # inference only, so no gradients are needed
      embeddings = model(input_tensor)  # forward pass that produces the embedding
    return embeddings.squeeze().numpy()  # drop the batch dimension: shape (512,)
  except Exception as e:
    print("Error:", e)
    return None

In [18]:
!mkdir -p images # create the images folder (no error if it already exists)

In [19]:
# Download pictures from a URL

import requests
import os

def save_image_from_url(image_url, image_name):
  """
  Download an image from a URL and save it to the "images" folder.

  Args:
    image_url: The URL of the image to download.
    image_name: The file name to save the image as.
  """
  try:
    if not os.path.exists("images"):
      os.makedirs("images")

    image_path=os.path.join("images",image_name)
    response=requests.get(image_url, stream=True)
    response.raise_for_status()  # raise an exception for HTTP error status codes

    with open(image_path, 'wb') as file:
      for chunk in response.iter_content(chunk_size=8192):
        file.write(chunk)

    print(f"Image saved to: {image_path}")
  except requests.exceptions.RequestException as e:
    print(f"Error downloading image: {e}")
  except Exception as e:
    print(f"Error saving image: {e}")
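
`save_image_from_url` writes whatever bytes the server returns, so a URL that points to a web page rather than to an image file still produces a `.jpg` on disk. A minimal sketch of a sanity check follows; `looks_like_image` is a hypothetical helper (not part of the notebook) that only inspects the first bytes of the file for a JPEG or PNG signature.

```python
import os
import tempfile

def looks_like_image(path):
    """Return True if the file starts with a JPEG or PNG signature."""
    with open(path, "rb") as file:
        header = file.read(8)
    is_jpeg = header.startswith(b"\xff\xd8\xff")
    is_png = header.startswith(b"\x89PNG\r\n\x1a\n")
    return is_jpeg or is_png

# Demonstration with two throwaway files: an HTML page saved under a .jpg
# name, and a file that begins with the PNG signature.
with tempfile.TemporaryDirectory() as folder:
    html_path = os.path.join(folder, "page.jpg")
    with open(html_path, "wb") as file:
        file.write(b"<!DOCTYPE html><html></html>")

    png_path = os.path.join(folder, "tiny.png")
    with open(png_path, "wb") as file:
        file.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)

    html_ok = looks_like_image(html_path)
    png_ok = looks_like_image(png_path)

print(html_ok, png_ok)  # False True
```

Calling such a check right after a download would catch a saved HTML page before it reaches `create_image_embeddings`.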

In [20]:
save_image_from_url("https://www.facebook.com/photo/?fbid=6506220269475611&set=a.187728274658207", "Khalid.jpg")

Image saved to: images/Khalid.jpg


In [None]:
# Example usage: create the embedding of one picture

image_path2="/content/images/Abu_1.JPG"
abu=create_image_embeddings(image_path2)

print("Image embedding shape:", abu.shape) # the shape shows the embedding length, which is 512
print("Image Embedding:",abu)

In [33]:
abu_1=create_image_embeddings("/content/images/Abu_1.JPG")
abu_2=create_image_embeddings("/content/images/Abu_3.jpg")
nav_1=create_image_embeddings("/content/images/Naveed_1.jpg")
nav_2=create_image_embeddings("/content/images/Naveed_2.jpg")
nav_3=create_image_embeddings("/content/images/Naveed_3.jpg")
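
Before storing these vectors, it can help to see how two embeddings are compared. The sketch below computes cosine similarity with NumPy. It uses random stand-in vectors rather than real face embeddings, so the variable names and numbers are illustrative assumptions only; in the notebook you would pass real embeddings such as `abu_1` and `abu_2`.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors (1.0 means same direction)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors for illustration only (not real face embeddings).
rng = np.random.default_rng(0)
person_a_photo1 = rng.normal(size=512)
# A slightly perturbed copy plays the role of a second photo of the same person.
person_a_photo2 = person_a_photo1 + 0.3 * rng.normal(size=512)
# An independent vector plays the role of an unrelated face.
person_b_photo1 = rng.normal(size=512)

same = cosine_similarity(person_a_photo1, person_a_photo2)
different = cosine_similarity(person_a_photo1, person_b_photo1)
print(f"same person: {same:.3f}, different person: {different:.3f}")
```

Two photos of the same person should score noticeably higher than photos of different people, which is the property the database search relies on.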

## **Save all of the above image embeddings into a `Milvus-lite` database:**


In [25]:
!pip install -U -q milvus-lite


In [26]:
!pip install -U -q pymilvus


In [29]:
from pymilvus import MilvusClient

client=MilvusClient("./milvus_demo.db")  # creates a local Milvus Lite database file in the working directory

In [37]:
import numpy as np

client.create_collection(
    collection_name="security_system",
    dimension=512 # must equal the length of the embedding vectors
)

In [38]:
data=[
    {"id":1,"person_name":"Khalid","vector":abu_1},
    {"id":2,"person_name":"Khalid","vector":abu_2},
    {"id":3,"person_name":"Naveed","vector":nav_1},
    {"id":4,"person_name":"Naveed","vector":nav_2},
    {"id":5,"person_name":"Naveed","vector":nav_3}
]

In [39]:
res=client.insert(
    collection_name="security_system",
    data=data
)

In [40]:
res=client.search(
    collection_name="security_system",
    data=[abu_1],
    limit=1,
    output_fields=["id","person_name"]
)

print(res)

data: ["[{'id': 1, 'distance': 1.0, 'entity': {'person_name': 'Khalid', 'id': 1}}]"]
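
The `distance` of 1.0 for a vector that is already stored suggests the collection scores matches by cosine similarity, where higher means more similar. Under that assumption, the sketch below shows what such a search computes conceptually, as a brute-force scan in NumPy. It is not how Milvus is implemented, and `brute_force_search` and the random stand-in vectors are hypothetical.

```python
import numpy as np

def brute_force_search(query, stored, limit=1):
    """Return (row_index, cosine_similarity) pairs, best match first."""
    stored = np.asarray(stored, dtype=np.float64)
    query = np.asarray(query, dtype=np.float64)
    # Normalise so that a dot product equals cosine similarity.
    stored_unit = stored / np.linalg.norm(stored, axis=1, keepdims=True)
    query_unit = query / np.linalg.norm(query)
    scores = stored_unit @ query_unit
    best = np.argsort(-scores)[:limit]  # indices of the highest scores
    return [(int(i), float(scores[i])) for i in best]

# Hypothetical stand-ins for the five stored 512-dimensional embeddings.
rng = np.random.default_rng(1)
stored_vectors = rng.normal(size=(5, 512))

# Searching with a vector that is already stored returns that same row
# with a score of (approximately) 1.0.
hits = brute_force_search(stored_vectors[0], stored_vectors, limit=1)
print(hits)
```

A vector database does the same job with indexes so that it stays fast when the collection grows far beyond five rows.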


## Upload a different picture and search for that person in your database

In [42]:
# upload a different picture & create its embeddings

abu_3=create_image_embeddings("/content/images/abu_4.jpg")

In [43]:
res=client.search(
    collection_name="security_system",
    data=[abu_3],
    limit=1,
    output_fields=["id","person_name"]
)

print(res)

data: ["[{'id': 2, 'distance': 0.6875054836273193, 'entity': {'person_name': 'Khalid', 'id': 2}}]"]


**Note: You have now uploaded a different picture of the same person, and the model has recognised him from the database.
You can now build a security system on top of your database, similar to the NADRA system.**
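
A search always returns the closest stored face, even for someone who is not in the database, so a security system needs a rule for rejecting weak matches. The sketch below applies a score threshold to hits shaped like the search output printed above. `identify` and the value `0.6` are hypothetical choices for illustration, not values from the notebook.

```python
# Hypothetical cut-off: scores below this are treated as "unknown person".
# 0.6 is an illustrative value only; tune it on your own photos.
MATCH_THRESHOLD = 0.6

def identify(hits, threshold=MATCH_THRESHOLD):
    """Return the matched person's name, or None if no hit is similar enough.

    `hits` is the list of matches for ONE query vector, with the keys
    'id', 'distance' and 'entity' seen in the search output.
    """
    if not hits:
        return None
    best = max(hits, key=lambda hit: hit["distance"])
    if best["distance"] < threshold:
        return None
    return best["entity"]["person_name"]

# A hit copied from the search output, plus a made-up weak match.
known = [{"id": 2, "distance": 0.6875054836273193,
          "entity": {"person_name": "Khalid", "id": 2}}]
stranger = [{"id": 4, "distance": 0.21,
             "entity": {"person_name": "Naveed", "id": 4}}]

print(identify(known))     # Khalid
print(identify(stranger))  # None
```

In the notebook this would likely be called as `identify(res[0])`, assuming `res` holds one list of hits per query vector.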


## **Recognizing the voice of a person using `YAMNet`.**


You can use YAMNet for voice recognition in the same way; the embedding, storage, and search code follows the same pattern.

All of the code is available at the link below:  

[Sir Qasim Repo](https://github.com/EnggQasim/5_days_AI_Agents_Training/blob/main/03_Image_Sound_RAG_Ollama_fastapi/03_voice_embedding.ipynb)
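
One difference from the face workflow is that YAMNet produces one 1024-dimensional embedding per short audio frame rather than a single vector per recording. A common way to get one vector per clip is to average the frames. The sketch below shows that step with random stand-in data; `clip_embedding` is a hypothetical helper, and this is not necessarily the approach taken in the linked notebook.

```python
import numpy as np

def clip_embedding(frame_embeddings):
    """Average per-frame embeddings into one unit-length clip-level vector."""
    frames = np.asarray(frame_embeddings, dtype=np.float32)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (num_frames, dim)")
    pooled = frames.mean(axis=0)
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled

# Stand-in for YAMNet output: 20 frames x 1024 dimensions of random numbers.
rng = np.random.default_rng(2)
fake_frames = rng.normal(size=(20, 1024))

voice_vector = clip_embedding(fake_frames)
print(voice_vector.shape)
```

A collection that stores such vectors would need `dimension=1024` instead of the 512 used for the face embeddings.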