1. The pipeline:
A function that takes in an image and applies DeepFace (which uses RetinaFace for detection and ArcFace for embeddings in the background).
It crops every face found in the image and then embeds each of those faces. An image is taken as input and the output is a list of face crops with their embeddings, in whatever way DeepFace supports it.
The function takes in an image_path and returns a list of dictionaries, where each dictionary corresponds to one detected face.

2. Take a list of images and run the function above on each one.

In [2]:
from deepface import DeepFace
import cv2

In [None]:
def extract_faces_and_embeddings(image_path):
    # Load the image as a BGR numpy array; cv2.imread returns None on failure
    img = cv2.imread(image_path)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")

    # Detect faces + extract embeddings
    results = DeepFace.represent(
        img_path=image_path,
        model_name="ArcFace",
        detector_backend="retinaface",
        enforce_detection=True
    )

    # Print a compact summary
    print("\nDetected faces:")
    for i, res in enumerate(results):
        print(f"Face {i+1}:")
        print(f"  Confidence: {res['face_confidence']:.3f}")
        print(f"  Facial area: {res['facial_area']}")
        print(f"  Embedding preview: {res['embedding'][:5]} ...\n")

    output = []
    # For each detected face in the results
    for res in results:
        # Bounding box of the face, in pixel coordinates of the original image
        fa = res['facial_area']
        x, y, w, h = fa["x"], fa["y"], fa["w"], fa["h"]
        # Crop the face (numpy indexing is rows first: img[y, x])
        face_crop = img[y:y+h, x:x+w]

        # Save crop and embedding
        output.append({
            "face_crop": face_crop,
            "embedding": res['embedding']
        })

    return output
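
The crop above trusts the box coordinates as returned. As a hedged sketch (not part of the original notebook): if a detector ever reports a box that extends past the image border, a negative start index would make numpy slice from the other end of the array. The helper below, with the hypothetical name `clamp_box`, clips the box to the image bounds first. Whether this is needed depends on the DeepFace version and detector, which is an assumption here.

```python
def clamp_box(facial_area, img_height, img_width):
    """Return (x0, y0, x1, y1) clipped to the image bounds.

    `facial_area` is assumed to be a dict with "x", "y", "w", "h" keys,
    as in the printed output of the function above.
    """
    x, y = facial_area["x"], facial_area["y"]
    w, h = facial_area["w"], facial_area["h"]
    # Clip every corner into [0, width] / [0, height]
    x0 = min(max(x, 0), img_width)
    y0 = min(max(y, 0), img_height)
    x1 = min(max(x + w, 0), img_width)
    y1 = min(max(y + h, 0), img_height)
    return x0, y0, x1, y1
```

Inside the loop it could be used as `x0, y0, x1, y1 = clamp_box(fa, img.shape[0], img.shape[1])` followed by `face_crop = img[y0:y1, x0:x1]`.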


In [8]:
extract_faces_and_embeddings("../data/query.JPG")


Detected faces:
Face 1:
  Confidence: 1.000
  Facial area: {'x': 1155, 'y': 337, 'w': 299, 'h': 395, 'left_eye': (1380, 512), 'right_eye': (1248, 489), 'nose': (1309, 571), 'mouth_left': (1356, 628), 'mouth_right': (1233, 609)}
  Embedding preview: [-0.006661613006144762, 0.5733016133308411, -0.062371306121349335, 0.05965585634112358, -0.14647530019283295] ...

Face 2:
  Confidence: 1.000
  Facial area: {'x': 1007, 'y': 869, 'w': 287, 'h': 429, 'left_eye': (1204, 1048), 'right_eye': (1066, 1056), 'nose': (1131, 1147), 'mouth_left': (1206, 1188), 'mouth_right': (1090, 1196)}
  Embedding preview: [-0.08423464000225067, 0.2887779474258423, -0.17601321637630463, -0.1377624273300171, -0.038235586136579514] ...

Face 3:
  Confidence: 1.000
  Facial area: {'x': 1843, 'y': 521, 'w': 275, 'h': 394, 'left_eye': (2049, 690), 'right_eye': (1918, 672), 'nose': (1978, 743), 'mouth_left': (2021, 825), 'mouth_right': (1914, 810)}
  Embedding preview: [0.11335430294275284, 0.04925526678562164, -0.0444

[{'face_crop': array([[[ 2,  7, 10],
          [ 1,  6,  9],
          [ 0,  4,  5],
          ...,
          [ 0,  4,  7],
          [ 0,  5,  8],
          [ 0,  3,  4]],
  
         [[ 1,  6,  9],
          [ 0,  4,  7],
          [ 0,  3,  4],
          ...,
          [ 3,  7, 12],
          [ 3,  6, 11],
          [ 0,  3,  7]],
  
         [[ 0,  4,  5],
          [ 0,  2,  3],
          [ 0,  1,  2],
          ...,
          [ 4,  7, 15],
          [ 5,  8, 13],
          [ 1,  4,  9]],
  
         ...,
  
         [[21, 22, 18],
          [19, 17, 17],
          [14, 12, 18],
          ...,
          [ 9, 13,  8],
          [ 6, 10,  5],
          [ 5,  6,  2]],
  
         [[21, 19, 18],
          [22, 20, 20],
          [15, 13, 19],
          ...,
          [ 7, 13,  8],
          [ 6, 12,  7],
          [ 6, 10,  5]],
  
         [[19, 17, 16],
          [17, 12, 13],
          [10,  6, 11],
          ...,
          [ 7, 14,  9],
          [ 6, 13,  8],
          [ 4,  9,  

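The function prints the confidence, facial area and an embedding preview for each face. It returns a list of dictionaries, one per face, each holding the face crop ("face_crop") and the embedding ("embedding").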
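
Step 2 (taking a list of images) could be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the name `extract_from_images`, the added "image_path" key and the `failed` list are all hypothetical. The per-image function is passed in as `extractor`, so the loop does not depend on DeepFace directly. It assumes the extractor raises ValueError when an image cannot be processed, which is what DeepFace does when enforce_detection=True and no face is found.

```python
def extract_from_images(image_paths, extractor):
    """Run `extractor` on every path and tag each face with its source image."""
    all_faces = []
    failed = []
    for path in image_paths:
        try:
            faces = extractor(path)
        except ValueError as err:
            # Keep going if one image is unreadable or has no detectable face
            failed.append((path, str(err)))
            continue
        for face in faces:
            # Copy the face dict and record which image it came from
            all_faces.append({"image_path": path, **face})
    return all_faces, failed
```

A call would look like `faces, failed = extract_from_images(paths, extract_faces_and_embeddings)`, where `paths` is a list of image paths. Collecting failures instead of stopping at the first one is a design choice that suits large image folders.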