# [Late Chunking](https://jina.ai/news/late-chunking-in-long-context-embedding-models)

This notebook explains how "Late Chunking" can be implemented. First, install the requirements:

In [None]:
!pip install transformers==4.43.4

Then we load the model we want to use for the embedding. We choose `jinaai/jina-embeddings-v2-base-en`, but any other model that supports mean pooling can be used. Models with a large maximum context length are preferred, because the whole text is passed through the model in a single forward pass before it is split into chunks.

In [None]:
from transformers import AutoModel
from transformers import AutoTokenizer

# load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-base-en', trust_remote_code=True)
model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-en', trust_remote_code=True)

Now we define the text we want to encode and split it into chunks. The `chunk_by_sentences` function also returns the span annotations.
These give the start and end token index of each chunk, which is needed for the chunked pooling.

In [3]:
def chunk_by_sentences(input_text: str, tokenizer: callable):
    """
    Split the input text into sentences using the tokenizer
    :param input_text: The text snippet to split into sentences
    :param tokenizer: The tokenizer to use
    :return: A tuple containing the list of text chunks and their corresponding token spans
    """
    inputs = tokenizer(input_text, return_tensors='pt', return_offsets_mapping=True)
    punctuation_mark_id = tokenizer.convert_tokens_to_ids('.')
    sep_id = tokenizer.convert_tokens_to_ids('[SEP]')
    token_offsets = inputs['offset_mapping'][0]
    token_ids = inputs['input_ids'][0]
    chunk_positions = [
        (i, int(start + 1))
        for i, (token_id, (start, end)) in enumerate(zip(token_ids, token_offsets))
        if token_id == punctuation_mark_id
        and (
            token_offsets[i + 1][0] - token_offsets[i][1] > 0
            or token_ids[i + 1] == sep_id
        )
    ]
    chunks = [
        input_text[x[1] : y[1]]
        for x, y in zip([(1, 0)] + chunk_positions[:-1], chunk_positions)
    ]
    span_annotations = [
        (x[0], y[0]) for (x, y) in zip([(1, 0)] + chunk_positions[:-1], chunk_positions)
    ]
    return chunks, span_annotations
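
Because each span produced above starts at the token index where the previous one ended, a quick sanity check on the annotations can catch segmentation mistakes before pooling. The following is a minimal sketch; `check_span_annotations` is a hypothetical helper that is not part of the original notebook, and the toy spans are made up for illustration.

```python
def check_span_annotations(span_annotations, num_tokens):
    """Return True if all spans are non-empty, contiguous, and within [0, num_tokens]."""
    previous_end = None
    for start, end in span_annotations:
        # every span must be non-empty and lie inside the token sequence
        if not (0 <= start < end <= num_tokens):
            return False
        # every span must begin where the previous one ended
        if previous_end is not None and start != previous_end:
            return False
        previous_end = end
    return True


# toy spans for a sequence of 22 tokens (illustrative values only)
print(check_span_annotations([(1, 8), (8, 15), (15, 21)], num_tokens=22))
print(check_span_annotations([(1, 8), (9, 15)], num_tokens=22))  # gap between spans
```

The check is deliberately strict about contiguity; if a segmentation method leaves gaps on purpose, that condition would need to be relaxed.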

In production, you should use a more advanced and robust segmentation method, such as the Jina AI Tokenizer API: https://jina.ai/tokenizer#apiform.

In [4]:
import requests

def chunk_by_tokenizer_api(input_text: str, tokenizer: callable):
    # Define the API endpoint and payload
    url = 'https://tokenize.jina.ai/'
    payload = {
        "content": input_text,
        "return_chunks": "true",
        "max_chunk_length": "1000"
    }

    # Make the API request and fail early on HTTP errors
    response = requests.post(url, json=payload)
    response.raise_for_status()
    response_data = response.json()

    # Extract chunks and positions from the response
    chunks = response_data.get("chunks", [])
    chunk_positions = response_data.get("chunk_positions", [])

    # Convert each [start, end] pair into a tuple
    span_annotations = [(start, end) for start, end in chunk_positions]

    return chunks, span_annotations
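
The pooling step later in this notebook slices token embeddings, so it needs token indices. If a segmentation service reports chunk positions as character offsets (an assumption here, to be verified against the actual response), they can be mapped to token spans with the tokenizer's offset mapping. The sketch below uses a hypothetical helper, `char_spans_to_token_spans`, and hand-written toy offsets instead of a real tokenizer.

```python
def char_spans_to_token_spans(char_spans, token_offsets):
    """Map (char_start, char_end) spans to (token_start, token_end) spans.

    token_offsets holds one (char_start, char_end) pair per token, in the
    style of an offset mapping. Zero-width entries such as (0, 0), which
    special tokens commonly get, are skipped.
    """
    token_spans = []
    for char_start, char_end in char_spans:
        # indices of all real tokens that overlap the character span
        indices = [
            i
            for i, (tok_start, tok_end) in enumerate(token_offsets)
            if tok_end > tok_start and tok_start < char_end and tok_end > char_start
        ]
        if indices:
            token_spans.append((indices[0], indices[-1] + 1))
    return token_spans


# toy offsets for "Hi there. Bye now." with [CLS] first and [SEP] last
token_offsets = [(0, 0), (0, 2), (3, 8), (8, 9), (10, 13), (14, 17), (17, 18), (0, 0)]
char_spans = [(0, 9), (10, 18)]
print(char_spans_to_token_spans(char_spans, token_offsets))  # [(1, 4), (4, 7)]
```

The end index is exclusive, matching the `embeddings[start:end]` slicing used for pooling.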

Now let's try to segment a toy example.

In [5]:
input_text = """
## Samsung Galaxy S25 Support Documentation: Common Issues and Solutions

The Samsung Galaxy S25 series, including the S25, S25+, and S25 Ultra, is a flagship lineup packed with advanced features. However, like any new smartphone release, users have reported various issues. This support document outlines common problems, troubleshooting steps, and solutions to ensure optimal performance.

---

### **Common Issues with the Galaxy S25 Series**

#### **1. Display Issues**
- **Problems:** Users have reported scrolling lag, screen flickering, and dim brightness.
- **Causes:** These issues are often software-related or linked to display settings.
- **Solutions:**
  - Disable Always-On Display (AOD) by entering Safe Mode and uninstalling the AOD app[5][8].
  - Adjust screen brightness in *Settings > Display > Screen mode* and use the Vividness slider for preferred color tuning[8].
  - Increase touch sensitivity via *Settings > Display* if taps are missed or scrolling feels unresponsive[8].

#### **2. Overheating**
- **Problems:** Some Galaxy S25 devices experience overheating during intensive use.
- **Causes:** Likely related to the Snapdragon 8 Elite chipset or high-performance settings.
- **Solutions:**
  - Avoid using the device in direct sunlight or while charging.
  - Enable power-saving mode in *Settings > Battery* to reduce strain on the processor[3][9].
  - Check for software updates that may address chipset optimization[9].

#### **3. Camera Problems**
- **Problems:** Issues include streaks in photos taken under strong light conditions and inconsistent autofocus.
- **Causes:** Software bugs or hardware calibration errors.
- **Solutions:**
  - Update camera software via *Settings > Software Update*[3][9].
  - Clear cache for the camera app in *Settings > Apps > Camera > Storage > Clear Cache*[4].
  - If problems persist, visit a Samsung service center for hardware inspection[9].

#### **4. Battery Drain**
- **Problems:** Faster-than-expected battery depletion even under moderate usage.
- **Causes:** High refresh rates, background apps, or Always-On Display.
- **Solutions:**
  - Check battery usage in *Settings > Battery and Device Care* and restrict power-hungry apps[9].
  - Reduce refresh rate to Adaptive or Standard mode via *Settings > Display*[9].
  - Disable features like Always-On Display and Location Services when not needed[9].

#### **5. App Notification Delays**
- **Problems:** Notifications from apps like WhatsApp and Facebook are delayed.
- **Causes:** Battery optimization settings or app-specific bugs.
- **Solutions:**
  - Disable battery optimization for affected apps in *Settings > Battery > Background Usage Limits*[5].
  - Ensure Do Not Disturb mode is turned off[9].
  - Update apps to their latest versions via Google Play Store[3].

#### **6. Android Auto Connectivity Issues**
- **Problems:** Some users report that Android Auto fails to recognize the device.
- **Causes:** Compatibility issues with car head units or software bugs.
- **Solutions:**
  - Try a different USB cable or port.
  - Clear cache for Android Auto via *Settings > Apps > Android Auto > Storage > Clear Cache*[4].
  - Wait for updates from Google to address compatibility issues[3].

---

### **Troubleshooting Steps**

#### **Soft Reset (Force Restart)**
If your device becomes unresponsive:
1. Press and hold the Volume Down key + Side key simultaneously for at least 7 seconds until the phone restarts[4].

#### **Clear App Cache**
To resolve app-related issues:
1. Navigate to *Settings > Battery and Device Care*.
2. Tap "Optimize Now" to clear cache data[4].

#### **Factory Reset**
To resolve persistent problems:
1. Go to *Settings > General Management > Reset > Factory Data Reset*.
2. Confirm reset after backing up important data, as this erases all personal information[5][6].

#### **Safe Mode Troubleshooting**
For issues caused by third-party apps:
1. Restart into Safe Mode by holding Volume Down + Side key until the Samsung logo appears.
2. Uninstall problematic apps via *Settings > Apps*[4][5].

---

### **Hardware-Specific Features**

#### Galaxy S25 Ultra
The Ultra model includes unique features such as:
- S Pen functionality for advanced productivity tasks[2].
- Enhanced camera capabilities like macro photography and 8K video recording[7].

#### Wireless Charging
All models support wireless charging but may require removing accessories or cases for stable charging performance[2].

---

### **Service Center Support**
If troubleshooting fails:
1. Visit a Samsung service center for hardware repairs or replacements.
2. Check service center wait times online to minimize delays[4].

---

### Conclusion
The Samsung Galaxy S25 series offers cutting-edge technology but may encounter occasional issues typical of new devices. By following these troubleshooting steps and solutions, users can resolve most problems efficiently while maximizing their smartphone experience. For unresolved issues, professional assistance is available through Samsung service centers.



"""


# determine chunks
chunks, span_annotations = chunk_by_sentences(input_text, tokenizer)
print('Chunks:\n- "' + '"\n- "'.join(chunks) + '"')


Chunks:
- "
## Samsung Galaxy S25 Support Documentation: Common Issues and Solutions

The Samsung Galaxy S25 series, including the S25, S25+, and S25 Ultra, is a flagship lineup packed with advanced features."
- " However, like any new smartphone release, users have reported various issues."
- " This support document outlines common problems, troubleshooting steps, and solutions to ensure optimal performance."
- "

---

### **Common Issues with the Galaxy S25 Series**

#### **1."
- " Display Issues**
- **Problems:** Users have reported scrolling lag, screen flickering, and dim brightness."
- "
- **Causes:** These issues are often software-related or linked to display settings."
- "
- **Solutions:**
  - Disable Always-On Display (AOD) by entering Safe Mode and uninstalling the AOD app[5][8]."
- "
  - Adjust screen brightness in *Settings > Display > Screen mode* and use the Vividness slider for preferred color tuning[8]."
- "
  - Increase touch sensitivity via *Settings > Display* if ta

Now we encode the chunks with the traditional and the context-sensitive late_chunking method:

In [6]:
def late_chunking(
    model_output: 'BatchEncoding', span_annotation: list, max_length=None
):
    token_embeddings = model_output[0]
    outputs = []
    for embeddings, annotations in zip(token_embeddings, span_annotation):
        if (
            max_length is not None
        ):  # drop or clip annotations that go beyond the max length of the model
            annotations = [
                (start, min(end, max_length - 1))
                for (start, end) in annotations
                if start < (max_length - 1)
            ]
        pooled_embeddings = [
            embeddings[start:end].sum(dim=0) / (end - start)
            for start, end in annotations
            if (end - start) >= 1
        ]
        pooled_embeddings = [
            embedding.detach().cpu().numpy() for embedding in pooled_embeddings
        ]
        outputs.append(pooled_embeddings)

    return outputs
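
To make the pooling step concrete, here is a minimal sketch of the same idea on toy data. Plain Python lists stand in for the model's token embeddings, `mean_pool_spans` is a hypothetical name that does not appear in the notebook, and the numbers are made up for illustration.

```python
def mean_pool_spans(token_embeddings, span_annotations):
    """Average the token vectors inside each (start, end) span; end is exclusive."""
    pooled = []
    for start, end in span_annotations:
        window = token_embeddings[start:end]
        if not window:
            continue  # skip empty spans
        dim = len(window[0])
        pooled.append(
            [sum(vector[d] for vector in window) / len(window) for d in range(dim)]
        )
    return pooled


# six tokens with 2-dimensional embeddings (toy values)
token_embeddings = [
    [0.0, 0.0],  # [CLS]
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
    [0.0, 0.0],  # [SEP]
]
print(mean_pool_spans(token_embeddings, [(1, 3), (3, 5)]))  # [[2.0, 3.0], [6.0, 7.0]]
```

Each chunk embedding is the mean of its token embeddings. In late chunking those token embeddings were computed with the whole text as context, which is what separates this from embedding each chunk on its own.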

In [7]:
# chunk before
embeddings_traditional_chunking = model.encode(chunks)

# chunk afterwards (context-sensitive chunked pooling)
inputs = tokenizer(input_text, return_tensors='pt')
model_output = model(**inputs)
embeddings = late_chunking(model_output, [span_annotations])[0]

Finally, we compare the similarity of the query "Samsung Galaxy S25 Ultra" with each chunk. For chunks that do not name the device explicitly, the similarity should be higher with the context-sensitive chunked pooling method:

In [9]:
import numpy as np

cos_sim = lambda x, y: np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

s25_ultra_embedding = model.encode('Samsung Galaxy S25 Ultra')

for chunk, new_embedding, trad_embeddings in zip(chunks, embeddings, embeddings_traditional_chunking):
    print(f'similarity_new("Samsung Galaxy S25 Ultra", "{chunk}"):', cos_sim(s25_ultra_embedding, new_embedding))
    print(f'similarity_trad("Samsung Galaxy S25 Ultra", "{chunk}"):', cos_sim(s25_ultra_embedding, trad_embeddings))

similarity_new("Samsung Galaxy S25 Ultra", "
## Samsung Galaxy S25 Support Documentation: Common Issues and Solutions

The Samsung Galaxy S25 series, including the S25, S25+, and S25 Ultra, is a flagship lineup packed with advanced features."): 0.8225856
similarity_trad("Samsung Galaxy S25 Ultra", "
## Samsung Galaxy S25 Support Documentation: Common Issues and Solutions

The Samsung Galaxy S25 series, including the S25, S25+, and S25 Ultra, is a flagship lineup packed with advanced features."): 0.8906413
similarity_new("Samsung Galaxy S25 Ultra", " However, like any new smartphone release, users have reported various issues."): 0.8114611
similarity_trad("Samsung Galaxy S25 Ultra", " However, like any new smartphone release, users have reported various issues."): 0.7030146
similarity_new("Samsung Galaxy S25 Ultra", " This support document outlines common problems, troubleshooting steps, and solutions to ensure optimal performance."): 0.7745215
similarity_trad("Samsung Galaxy S25 Ultra"
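
The scores above are cosine similarities. As a small worked example of that formula, the sketch below computes it in plain Python on toy vectors; `cosine_similarity` is an illustrative stand-in for the `cos_sim` lambda, and it assumes neither vector is all zeros.

```python
import math


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
print(cosine_similarity([3.0, 4.0], [4.0, 3.0]))  # 24 / 25 = 0.96
```

Only the direction of the vectors matters, so the difference in scale between a summed and an averaged embedding would not change the score.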

In [14]:
import numpy as np
import pandas as pd

# Define cosine similarity function
cos_sim = lambda x, y: np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Reuse `chunks`, `embeddings`, `embeddings_traditional_chunking`, and
# `s25_ultra_embedding` from the cells above, so that the saved scores
# reflect the actual model output rather than random placeholder vectors.

# Collect results
results = []
for chunk, new_embedding, trad_embeddings in zip(chunks, embeddings, embeddings_traditional_chunking):
    old_score = cos_sim(s25_ultra_embedding, trad_embeddings)
    new_score = cos_sim(s25_ultra_embedding, new_embedding)
    results.append({"query text": "Samsung Galaxy S25 Ultra", "text chunk": chunk, "old score": old_score, "new score": new_score})

# Convert results to DataFrame
df = pd.DataFrame(results)

# Save the DataFrame to a CSV file
df.to_csv('results.csv', index=False)
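
Once the scores are saved, a short summary makes the comparison easier to read. The sketch below uses only the standard library and assumes the column names written above (`old score`, `new score`). `summarize_scores` is a hypothetical helper, and the two sample rows reuse the scores printed earlier for the first two chunks, with shortened chunk labels.

```python
import csv
import io


def summarize_scores(csv_file):
    """Count rows where the new score beats the old one and average the difference."""
    reader = csv.DictReader(csv_file)
    deltas = [float(row["new score"]) - float(row["old score"]) for row in reader]
    improved = sum(1 for delta in deltas if delta > 0)
    mean_delta = sum(deltas) / len(deltas) if deltas else 0.0
    return {"rows": len(deltas), "improved": improved, "mean_delta": mean_delta}


# in-memory stand-in for results.csv
sample = io.StringIO(
    "query text,text chunk,old score,new score\n"
    "Samsung Galaxy S25 Ultra,chunk 1,0.8906413,0.8225856\n"
    "Samsung Galaxy S25 Ultra,chunk 2,0.7030146,0.8114611\n"
)
summary = summarize_scores(sample)
print(summary["rows"], summary["improved"])  # 2 1
```

To run it on the saved file, pass an open file handle instead, for example `open('results.csv', newline='')`.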
