# Naruhodo Podcast Graph Database

**Naruhodo** is a Brazilian podcast dedicated to answering listeners’ questions about science, common sense, and curiosities. Every episode is packed with science-based content and is enriched with a diverse set of references—ranging from scientific papers and articles to books and online resources. Many episodes share overlapping themes and often reference the same sources, which makes the dataset ideal for creating an interconnected graph.

This project focuses on scraping the available Naruhodo podcast data and importing it into Neo4j. The primary objective here is to efficiently collect and structure the data into a graph database, establishing a robust foundation. Future projects will build upon this groundwork to reveal connections between episodes, identify clusters of related themes, and explore how references bridge multiple subjects.

## Table of Contents

- [Introduction](#introduction)
- [Project Structure](#project-structure)
- [Environment and Dependencies](#Environment-and-dependencies)
- [Code Breakdown](#Code-breakdown)
  - [1. Data Scraping Module](#data-scraping-module)
  - [2. Data Collection and CSV Generation](#data-collection-and-csv-generation)
  - [3. CSV Normalization](#csv-normalization)
  - [4. Neo4j Data Import](#neo4j-data-import)
- [Analytical Possibilities in Neo4j](#analytical-possibilities-in-neo4j)
- [Conclusion](#conclusion)


<a name="introduction"></a>
## Introduction

*Naruhodo* is not only a podcast—it’s a curated collection of scientific exploration where episodes often intersect through shared references. **The primary goal of this notebook is to scrape the available Naruhodo podcast data and import it into Neo4j, creating a robust graph database foundation.** Further projects utilizing this dataset will be developed in separate notebooks.

This foundational project opens up a wide range of future possibilities, especially with the integration of LLMs and Machine Learning. Here are the top 5 potential projects that can be pursued once the data is in Neo4j:

1. **Retrieval-Augmented Generation (RAG) for Podcast Summaries:**  
   Combine large language models (LLMs) with data retrieval from Neo4j to generate insightful episode summaries or answer user queries by referencing related content.

2. **RAG-Graph for Thematic Exploration:**  
   Integrate RAG techniques with graph-based search methods to offer context-aware, detailed insights into episodes. This approach can help users navigate complex scientific topics by linking episodes and references seamlessly.

3. **Episode Clusterization and Recommendation Systems:**  
   Apply clustering algorithms on the graph data to identify groups of episodes that share common themes or references. This can power personalized recommendation systems, suggesting episodes similar to those users already enjoy.

4. **Pathway Discovery for Thematic Learning:**  
   Leverage graph analytics to map out learning pathways. For example, if a user is interested in the theme of behavior, the system can highlight a sequential pathway through episodes and references that deepen their understanding of the topic.

5. **Interdisciplinary Knowledge Mapping:**  
   Analyze the intersections of various scientific disciplines across episodes by examining shared references. This can uncover hidden relationships and provide insights into how different fields influence each other.

The following sections explain how the data is scraped, normalized, and imported into Neo4j, setting the stage for these advanced analyses and applications in future projects.


For more details about the podcast and its themes, you can check out [Naruhodo on B9](https://www.b9.com.br/shows/naruhodo/).

<a name="project-structure"></a>
## Project Structure

The repository is organized into the following modules:

- **Environment Configuration:**  
  Stores all sensitive connection details (such as Neo4j credentials and file paths) in a `.env` file using `python-dotenv`. This keeps your configuration secure and separate from the code.

- **Data Scraping Module:**  
  Contains functions that send HTTP requests, parse HTML content, and extract references from individual podcast posts. This module forms the foundation for gathering raw data from the Naruhodo website.

- **Data Collection and CSV Generation:**  
  Iterates over the podcast listing pages to collect all post URLs and then scrapes each post for its references. The collected data is saved as a wide CSV file: each row holds the episode URL in an `Episode` column, followed by `Reference_1`, `Reference_2`, and so on, with the remaining cells left empty for episodes that have fewer references.

- **CSV Normalization:**  
  Converts the wide CSV into a normalized (long) CSV format. In the normalized file, each row represents a single relationship between an episode and one reference, making the data ideal for graph import and subsequent analysis.

- **Neo4j Data Import:**  
  Loads the normalized CSV file and builds the graph in Neo4j by creating nodes for episodes and references, and establishing `:REFERENCES` relationships between them. This module lays the groundwork for future graph-based analyses and applications.


<a name="Environment-and-dependencies"></a>
## Environment and Dependencies

- **Python 3.x**
- **Dependencies:**
  - `requests`
  - `beautifulsoup4`
  - `neo4j` (the official Neo4j Python driver, installed with `pip install neo4j`)
  - `python-dotenv`
  - `pandas`
  - `csv` (Python’s built-in module)

All sensitive configuration values—such as the Neo4j URI, username, and password, as well as the output CSV path—are stored in a single `.env` file that is excluded from version control.
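
As a minimal sketch of how these values might be read and validated at startup: the variable names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` match the import code in this project, while the helper `read_neo4j_config` and the example values are hypothetical. In the real project, `load_dotenv()` would be called first so that the `.env` entries appear in `os.environ`.

```python
import os
from typing import Mapping

REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def read_neo4j_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the Neo4j settings, failing early if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing settings in .env: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


# Example with an in-memory mapping instead of a real .env file.
# The values below are placeholders, not real credentials.
example_env = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USER": "neo4j",
    "NEO4J_PASSWORD": "change-me",
}
config = read_neo4j_config(example_env)
print(config["NEO4J_URI"])
```

Failing early with a clear message is usually easier to debug than a connection error raised later by the driver.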

<a name="Code-breakdown"></a>
## Code Breakdown

<a name="data-scraping-module"></a>
### 1. Data Scraping Module
**`get_soup(url: str) -> BeautifulSoup`**  
  **Purpose:**  
  - Sends a GET request to the given URL using custom headers.
  - Handles HTTP errors and sets the proper encoding.
  - Returns a BeautifulSoup object for HTML parsing.


**`extract_references(post_url: str) -> List[str]`**  
  **Purpose:**  
  - Fetches the HTML content of a podcast post.
  - Locates the “REFERÊNCIAS” section and extracts all subsequent reference texts until a delimiter is encountered.
  - Returns a list of reference strings (or an empty list if no references are found).
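
The collection rule used by `extract_references` (start after the “REFERÊNCIAS” paragraph, stop at the `========` delimiter) can be illustrated without any HTML. The sketch below is a simplified model that works on a plain list of paragraph texts; the helper name `collect_references` and the sample data are hypothetical, and the real function walks BeautifulSoup sibling elements instead.

```python
from typing import Iterable, List

SECTION_MARKER = "REFERÊNCIAS"
END_DELIMITER = "========"


def collect_references(paragraphs: Iterable[str]) -> List[str]:
    """Collect texts after the marker paragraph, stopping at the delimiter."""
    references: List[str] = []
    inside_section = False
    for text in paragraphs:
        text = text.strip()
        if not inside_section:
            # Skip everything up to and including the marker paragraph.
            inside_section = SECTION_MARKER in text
            continue
        if text == END_DELIMITER:
            break
        references.append(text)
    return references


# Hypothetical paragraph texts, standing in for the parsed page.
page = ["Intro", "REFERÊNCIAS", "Paper A", "Paper B", "========", "Footer"]
print(collect_references(page))
```

A page without the marker yields an empty list, which mirrors the behavior described above.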

In [17]:
# Importing libraries
import random
import time
import requests
from bs4 import BeautifulSoup
from typing import List, Set
import csv

# Base URL of the website to scrape.
BASE_URL: str = 'https://www.b9.com.br'

# Custom headers to mimic a real browser request.
HEADERS: dict[str, str] = {
    'User-Agent': (
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
        'AppleWebKit/537.36 (KHTML, like Gecko) '
        'Chrome/90.0.4430.93 Safari/537.36'
    )
}


def get_soup(url: str) -> BeautifulSoup:
    """
    Fetch the content from the given URL and return a BeautifulSoup object
    for parsing the HTML.

    Args:
        url (str): The URL of the webpage to fetch.

    Returns:
        BeautifulSoup: A BeautifulSoup object containing the parsed HTML.

    Raises:
        HTTPError: If the HTTP request fails (non-200 status code).
    """
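    # Send a GET request with custom headers; the timeout prevents the call
    # from hanging indefinitely if the server stops responding.
    response = requests.get(url, headers=HEADERS, timeout=30)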
    # Raise an error for bad responses (e.g., 404, 500).
    response.raise_for_status()
    # Set the encoding to UTF-8 to properly interpret the response.
    response.encoding = 'utf-8'
    # Parse and return the HTML content using the built-in parser.
    return BeautifulSoup(response.text, 'html.parser')


def extract_references(post_url: str) -> List[str]:
    """
    Extract a list of reference strings from a post page.

    This function looks for a paragraph element containing the text
    'REFERÊNCIAS'. It then collects the text from all subsequent sibling
    elements until it encounters a sibling with the text '========', which is
    used as a delimiter to mark the end of the references section.

    Args:
        post_url (str): The URL of the post containing references.

    Returns:
        List[str]: A list of reference strings. If no references section is found,
                   an empty list is returned.
    """
    # Retrieve and parse the HTML of the post page.
    soup = get_soup(post_url)
    
    # Locate the paragraph element that contains 'REFERÊNCIAS'.
    references_section = soup.find('p', string=lambda x: x and 'REFERÊNCIAS' in x)
    if not references_section:
        return []
    
    references: List[str] = []
    # Iterate over all sibling elements that follow the references section.
    for sibling in references_section.find_next_siblings():
        text = sibling.get_text(strip=True)
        # Stop collecting references when encountering the delimiter.
        if text == '========':
            break
        references.append(text)
    
    return references


In [18]:
# Example usage:
if __name__ == '__main__':
    # Replace 'your_post_url' with the actual URL you want to scrape.
    your_post_url = 'https://www.b9.com.br/shows/naruhodo/naruhodo-418-o-que-e-a-birra/?highlight=naruhodo'
    refs = extract_references(your_post_url)
    for ref in refs:
        print(ref)
        

Assessment, management, and prevention of childhood temper tantrumshttps://journals.lww.com/jaanp/abstract/2012/10000/assessment,_management,_and_prevention_of.2.aspx
Temper Tantrums in Young Children: 2. Tantrum Duration and Temporal Organizationhttps://journals.lww.com/jrnldbp/fulltext/2003/06000/temper_tantrums_in_young_children__2__tantrum.3.aspx?casa_token=XT0dxgcDQJMAAAAA:KXBH6vF25IZT4vBlzGF3SysfHTm6XlWlcOFuAp_pcIfqXl2s_-yU_6pvKirSKoFbV8Y7jLlaqqq8zdLWV0W4NmaXTw
Temper Tantrums in Young Children: 1. Behavioral Compositionhttps://journals.lww.com/jrnldbp/fulltext/2003/06000/temper_tantrums_in_young_children__1__behavioral.2.aspx?casa_token=86hhrSeXMh0AAAAA:ZEF3NP81tjsathb5NVrGbcc08KdVqBjLNRBGr4pwZAkkRZszvPoUyZuTzdnwyRjirZ_ejI11i9YDHUVa3uNK1EAEOg
Meltdown/Tantrum Detection System for Individuals with Autism Spectrum Disorderhttps://www.tandfonline.com/doi/full/10.1080/08839514.2021.1991115
Developmental pathways from preschool temper tantrums to later psychopathologyhttps://www.camb

<a name="data-collection-and-csv-generation"></a>
### 2. Data Collection and CSV Generation
**`get_podcast_posts(page_number: int) -> List[str]`**  
  **Purpose:**  
  - Builds the listing-page URL from the page number.
  - Scrapes the page and collects every link whose `href` contains `naruhodo`, keeping only unique `b9.com.br` URLs and skipping links to external platforms (Apple Podcasts, Spotify, YouTube, and similar).

**`scrape_references() -> List[List[str]]`**   
  **Purpose:**  
  - Iterates over a fixed range of listing pages (pages 21 to 35 in the run shown in this section).
  - For each post URL, calls `extract_references` to collect the references, pausing 1 to 2 seconds between requests.
  - Aggregates the data so that each row consists of the post URL followed by its corresponding references.

**`save_to_csv(data: List[List[str]], filename: str = 'references3.csv') -> None`**   
  **Purpose:**  
  - Writes the aggregated data to a UTF-8 CSV file with the columns `Episode`, `Reference_1`, `Reference_2`, and so on.
  - Each row starts with the post URL; episodes with fewer references leave the remaining columns empty.
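
The link filtering step can be sketched on its own. This version is an assumption-laden variant, not the project's exact code: it resolves relative links with `urllib.parse.urljoin` and checks the host name instead of searching for `b9.com.br` as a substring. The helper name `unique_episode_links` and the sample links are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urljoin, urlparse

BASE_URL = "https://www.b9.com.br"


def unique_episode_links(hrefs: Iterable[str], base_url: str = BASE_URL) -> List[str]:
    """Resolve relative links, keep b9.com.br Naruhodo links, drop duplicates."""
    seen = set()
    result: List[str] = []
    for href in hrefs:
        absolute = urljoin(base_url, href)  # absolute URLs pass through unchanged
        host = urlparse(absolute).netloc.lower()
        is_b9 = host == "b9.com.br" or host.endswith(".b9.com.br")
        if is_b9 and "naruhodo" in absolute.lower() and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Hypothetical hrefs: a relative link, its absolute duplicate, an external link.
hrefs = [
    "/shows/naruhodo/naruhodo-418-o-que-e-a-birra/",
    "https://www.b9.com.br/shows/naruhodo/naruhodo-418-o-que-e-a-birra/",
    "https://open.spotify.com/show/naruhodo",
]
print(unique_episode_links(hrefs))
```

Comparing host names avoids accepting an external URL that merely mentions `b9.com.br` somewhere in its path or query string.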



In [23]:
import os
import pandas as pd
from dotenv import load_dotenv
from typing import NoReturn, List



# Updated URL format for pages
SEARCH_URL: str = 'https://www.b9.com.br/shows/naruhodo/?pagina={}#anchor-tabs'

def get_podcast_posts(page_number: int) -> List[str]:
    """
    Retrieve podcast post URLs from a search page.

    This function formats the search URL with the provided page number,
    fetches the page content using get_soup, and extracts all post links
    that contain 'naruhodo' in their href using multiple selectors to ensure
    we catch all episodes. Only keeps links from b9.com.br domain.

    Args:
        page_number (int): The page number to scrape.

    Returns:
        List[str]: A list of URLs for the podcast posts found on the page.
    """
    # Format the URL with the given page number and retrieve its parsed content.
    soup = get_soup(SEARCH_URL.format(page_number))
    
    # Find all links using multiple selectors to catch all possible episode links
    all_links = []
    
    # Method 1: Direct link search
    links = soup.find_all('a', href=lambda href: href and 'naruhodo' in href.lower())
    all_links.extend(links)
    
    # Method 2: Search in article titles
    articles = soup.find_all('article')
    for article in articles:
        title_links = article.find_all('a', href=lambda href: href and 'naruhodo' in href.lower())
        all_links.extend(title_links)
    
    # Method 3: Search in post listings
    post_listings = soup.find_all('div', class_='post-listing')
    for listing in post_listings:
        listing_links = listing.find_all('a', href=lambda href: href and 'naruhodo' in href.lower())
        all_links.extend(listing_links)
    
    # Extract unique URLs while preserving order and add base URL if needed
    # Only keep URLs from b9.com.br domain
    seen = set()
    unique_links = []
    for link in all_links:
        href = link['href']
        # Add base URL if the link is relative
        if href.startswith('/'):
            href = f"https://www.b9.com.br{href}"
        
        # Only keep links from b9.com.br domain
        if (href not in seen and 
            'naruhodo' in href.lower() and 
            'b9.com.br' in href.lower() and
            not any(ext in href.lower() for ext in ['podcast.apple', 'facebook', 'twitter', 'spotify', 'youtube'])):
            seen.add(href)
            unique_links.append(href)
    
    print(f"Found {len(unique_links)} unique b9.com.br podcast links on page {page_number}")
    return unique_links

# Iterates over listing pages 21 to 35 (the page range used for this run).
def scrape_references() -> List[List[str]]:
    """
    Scrape references from podcast posts across pages 21 to 35.

    Returns:
        List[List[str]]: A list of lists, where each inner list contains a post URL
                         and its corresponding references.
    """
    all_references: List[List[str]] = []
    
    # range(21, 36) covers pages 21 through 35 inclusive.
    for page in range(21, 36):
        print(f"Scraping page {page} of 35...")
        try:
            post_links = get_podcast_posts(page)
            
            # If no valid Naruhodo episodes are found, log and continue
            if not post_links:
                print(f"No Naruhodo episodes found on page {page}. Continuing to next page.")
                continue
            
            # Process each episode link
            for post_link in post_links:
                print(f"Scraping post {post_link}...")
                try:
                    references = extract_references(post_link)
                    # Prepend the post URL to the list of references.
                    all_references.append([post_link] + references)
                    # Pause for 1-2 seconds to be respectful to the server.
                    time.sleep(random.uniform(1, 2))
                except Exception as e:
                    print(f"Error scraping {post_link}: {str(e)}")
                    continue

        except Exception as e:
            print(f"Error processing page {page}: {str(e)}")
            # Continue to the next page instead of breaking
            continue

    print(f"Scraping completed. Processed {len(all_references)} episodes.")
    return all_references

def save_to_csv(data: List[List[str]], filename: str = 'references3.csv') -> None:
    """
    Save the scraped data to a CSV file with structured columns.
    Each reference will be in its own column (Reference_1, Reference_2, etc.).

    Args:
        data (List[List[str]]): List where each inner list contains [episode_url, reference1, reference2, ...]
        filename (str): Name of the output CSV file
    """
    # Find the maximum number of references in any episode.
    # default=0 avoids a ValueError when no episodes were scraped.
    max_refs = max((len(row) - 1 for row in data), default=0)  # -1 because first element is episode URL
    
    # Create column names
    columns = ['Episode'] + [f'Reference_{i+1}' for i in range(max_refs)]
    
    # Create a list of dictionaries where each dictionary represents a row
    structured_data = []
    for row in data:
        episode_data = {'Episode': row[0]}  # First element is always the episode URL
        
        # Add references to their respective columns
        for i, ref in enumerate(row[1:]):  # Skip the first element (episode URL)
            if ref.strip():  # Only add non-empty references
                episode_data[f'Reference_{i+1}'] = ref.strip()
        
        structured_data.append(episode_data)
    
    # Convert to DataFrame and save to CSV
    df = pd.DataFrame(structured_data, columns=columns)
    
    # Save to CSV, handling missing values properly
    df.to_csv(filename, index=False, encoding='utf-8')
    
    print(f"Saved {len(df)} episodes with up to {max_refs} references each to {filename}")
    print(f"Column names: {', '.join(columns)}")

if __name__ == "__main__":
    # Load environment variables from the .env file (if needed)
    load_dotenv()

    # Scrape references from the website and save them to a CSV file.
    references = scrape_references()
    save_to_csv(references)
    print("Data has been saved to references3.csv")

Scraping page 21 of 35...
Found 23 unique b9.com.br podcast links on page 21
Scraping post https://www.b9.com.br/shows/naruhodo/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-203-especial-premio-ig-nobel-2019-parte-2-de-2/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-202-especial-premio-ig-nobel-2019-parte-1-de-2/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-201-por-que-o-nosso-cerebro-as-vezes-falha/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-200-desafio-naruhodo-quantas-pessoas-restaram-na-cidade-das-sombras/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-199-existe-instinto-materno-parte-2-de-2/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-198-existe-instinto-materno-parte-1-de-2/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-197-adocantes-artificiais-podem-fazer-mal/...
Scraping post https://www.b9.com.br/shows/naruhodo/naruhodo-196-por-que-colecionamos-coisas

<a name="csv-normalization"></a>
### 3. CSV Normalization
**`process_file_to_long_format(file_path) -> pd.DataFrame`**  
  **Purpose:**  
  - Reads a wide CSV (an `Episode` column followed by `Reference_1`, `Reference_2`, and so on).
  - Melts the data into a long format with two columns, "Episode" and "Reference", dropping empty cells.
  - Each row in the result represents one episode–reference relationship.

The script then combines the results for `references2.csv` and `references3.csv`, removes duplicate rows, and saves them as `combined_references_long_format.csv`.
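
The reshaping itself can be sketched with the standard library alone, which may help when reasoning about what the normalization produces. This is an illustrative alternative, not the project's implementation: the helper name `wide_to_long` and the sample rows are hypothetical, and the row order differs from what `pd.melt` returns (this version goes episode by episode).

```python
import csv
import io
from typing import Iterator, Tuple


def wide_to_long(wide_csv: str) -> Iterator[Tuple[str, str]]:
    """Yield one (episode, reference) pair per non-empty Reference_* cell."""
    reader = csv.DictReader(io.StringIO(wide_csv))
    for row in reader:
        for column, value in row.items():
            # `column` is None for surplus cells; `value` is None for missing ones.
            if column and column.startswith("Reference_") and value and value.strip():
                yield row["Episode"], value.strip()


# Hypothetical two-episode input in the wide layout.
sample = (
    "Episode,Reference_1,Reference_2\n"
    "ep-1,Paper A,Paper B\n"
    "ep-2,Paper A,\n"
)
for episode, reference in wide_to_long(sample):
    print(episode, "->", reference)
```

Because "Paper A" appears under both episodes, the long format makes the shared reference explicit, which is what later becomes a shared node in the graph.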

In [25]:
import pandas as pd

# Function to process a single file into long format
def process_file_to_long_format(file_path):
    # Read the CSV file
    df = pd.read_csv(file_path)
    
    # Melt the dataframe to create a long format
    melted_df = pd.melt(
        df,
        id_vars=['Episode'],
        value_vars=[col for col in df.columns if col.startswith('Reference_')],
        var_name='Reference_Number',
        value_name='Reference'
    )
    
    # Clean up the data
    melted_df = melted_df.dropna(subset=['Reference'])
    melted_df = melted_df.drop('Reference_Number', axis=1)
    melted_df = melted_df[melted_df['Reference'].str.strip() != '']
    
    return melted_df

# Process both files
print("Processing references2.csv...")
df2 = process_file_to_long_format('references2.csv')
print(f"Shape of references2.csv after processing: {df2.shape}")

print("\nProcessing references3.csv...")
df3 = process_file_to_long_format('references3.csv')
print(f"Shape of references3.csv after processing: {df3.shape}")

# Combine the dataframes
combined_df = pd.concat([df2, df3], ignore_index=True)

# Remove any duplicates
combined_df = combined_df.drop_duplicates()

# Sort by episode URL
combined_df = combined_df.sort_values('Episode')

# Display information
print("\nFinal combined dataset:")
print("Shape:", combined_df.shape)
print("\nFirst few rows:")
display(combined_df.head(10))

# Save to a new CSV file
combined_df.to_csv('combined_references_long_format.csv', index=False)
print("\nSaved to combined_references_long_format.csv")

# Display some statistics
print("\nNumber of references per episode:")
print(combined_df.groupby('Episode').size().describe())

Processing references2.csv...
Shape of references2.csv after processing: (4846, 2)

Processing references3.csv...
Shape of references3.csv after processing: (781, 2)

Final combined dataset:
Shape: (5606, 2)

First few rows:


| | Episode | Reference |
|---|---|---|
| 5459 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | ==>https://www.b9.com.br/78405/naruhodo-94-o-q... |
| 4942 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Vídeo da Vox – Why the Myers-Briggs test is to... |
| 5220 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | ==>https://www.b9.com.br/70398/naruhodo-51-ast... |
| 5293 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Naruhodo #90 – O que é inteligência? |
| 5046 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | ==>https://www.youtube.com/watch?v=Q5pggDCnt5M |
| 5354 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | ==>https://www.b9.com.br/76934/naruhodo-90-o-q... |
| 5139 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Naruhodo #51 – Astrologia, horóscopo e mapa as... |
| 5409 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Naruhodo #94 – O que é o Teorema de Bayes? (E ... |
| 5045 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Podcasts das #Minas: Baseado em Fatos Surreais... |
| 5292 | https://www.b9.com.br/shows/naruhodo/naruhodo-... | Artigo: Marketing actions can modulate neural ... |



Saved to combined_references_long_format.csv

Number of references per episode:
count    356.000000
mean      15.747191
std        9.839737
min        1.000000
25%        8.000000
50%       16.000000
75%       22.000000
max       56.000000
dtype: float64


<a name="neo4j-data-import"></a>
### 4. Neo4j Data Import
**`load_data(filename: str = "references.csv") -> List[List[str]]`**  
  **Purpose:**  
  - Loads the normalized CSV file and returns the data as a list of rows, where each row is a list of strings.

**`create_graph(tx: Transaction, data: List[List[str]]) -> None`**  
  **Purpose:**  
  - Iterates over each row from the CSV.
  - For each row, creates (or merges) an Episode node (using the episode URL) and a Reference node (using the reference URL).
  - Establishes a `:REFERENCES` relationship between the Episode and Reference nodes via Cypher queries.

**`main() -> None`**  
  **Purpose:**  
  - Orchestrates the Neo4j data import process by loading the CSV data, opening a session, executing the transaction to create the graph, and closing the driver.
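
One way the import described above might be batched is to send all rows as a single parameter to an `UNWIND ... MERGE` query. The sketch below only builds the query text and its parameters; it does not connect to a database. The labels `Episode` and `Reference` and the `:REFERENCES` relationship follow the description, while the `url` property name, the `build_rows` helper, and the sample pairs are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed Cypher: the `url` property name is illustrative, not taken from the project.
IMPORT_QUERY = """
UNWIND $rows AS row
MERGE (e:Episode {url: row.episode})
MERGE (r:Reference {url: row.reference})
MERGE (e)-[:REFERENCES]->(r)
"""


def build_rows(pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert (episode, reference) pairs into the $rows parameter, skipping blanks."""
    rows = []
    for episode, reference in pairs:
        if episode and reference:
            rows.append({"episode": episode, "reference": reference})
    return rows


# Hypothetical pairs; the last one has an empty reference and is skipped.
pairs = [("ep-1", "ref-a"), ("ep-1", "ref-b"), ("ep-2", "")]
rows = build_rows(pairs)
print(len(rows))

# With the official driver, the query could then be sent as:
#     session.run(IMPORT_QUERY, rows=rows)
```

Using `MERGE` rather than `CREATE` keeps the import idempotent, so a reference shared by several episodes ends up as one node with several incoming relationships.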

In [12]:
pip install neo4j



### Data Cleaning

In [8]:
import pandas as pd
import re
import logging
import os

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

DATA_DIR = 'datasets/'

def clean_and_prepare_datasets():
    logger.info("Loading original dataset...")
    df = pd.read_csv(os.path.join(DATA_DIR, 'combined_references_long_format.csv'))
    logger.info(f"Initial dataset size: {len(df)} rows")

    # --- Basic Cleaning ---
    def clean_text(text):
        if not isinstance(text, str):
            return text
        text = re.sub('^==>', '', text.strip())
        text = text.rstrip('/')
        return text

    df['Episode'] = df['Episode'].apply(clean_text)
    df['Reference'] = df['Reference'].apply(clean_text)

    # Remove "Podcast das Minas" references
    df = df[~df['Reference'].str.contains('Podcast das #Minas|Podcasts das #Minas', na=False, regex=True)]
    logger.info(f"Dataset size after removing 'Podcast das Minas': {len(df)} rows")

    # --- Title/URL Separation ---
    def separate_title_url(text):
        if not isinstance(text, str):
            return pd.Series({'title': None, 'url': None})
        urls = re.findall(r'https?://[^\s]+', text)
        if len(urls) > 0:
            url = urls[0]
            title = text.replace(url, '').strip()
            if not title:
                title = extract_title_from_url(url)
            return pd.Series({'title': title, 'url': url})
        else:
            return pd.Series({'title': text, 'url': None})

    def extract_title_from_url(url):
        match = re.search(r'naruhodo-\d+-(.*?)(?:/|$)', url)
        if match:
            return match.group(1).replace('-', ' ').title()
        return None

    reference_parts = df['Reference'].apply(separate_title_url)
    df['reference_title'] = reference_parts['title']
    df['reference_url'] = reference_parts['url']

    episode_parts = df['Episode'].apply(separate_title_url)
    df['episode_title'] = episode_parts['title']
    df['episode_url'] = episode_parts['url']

    # --- Episode Number Extraction ---
    def extract_episode_number(url):
        if not isinstance(url, str):
            return None
        match = re.search(r'naruhodo-(\d+)', url)
        if match:
            return int(match.group(1))
        return None

    df['episode_number'] = df['episode_url'].apply(extract_episode_number)
    df['referenced_episode_number'] = df['reference_url'].apply(extract_episode_number)

    # --- Reference Type Classification ---
    logger.info("Classifying reference types...")
    def classify_reference(url, title):
        if not isinstance(url, str):
            if isinstance(title, str):
                return 9  # Unknown type for text-only references
            return 9  # Unknown type for invalid references
        url_lower = url.lower()
        if 'b9.com.br/shows/naruhodo' in url_lower:
            return 8  # Episode
        if any(domain in url_lower for domain in ['youtube.com', 'vimeo.com', 'youtu.be']):
            return 1  # Video
        if any(domain in url_lower for domain in ['doi.org', 'sciencedirect.com', 'springer.com', 'ncbi.nlm.nih.gov', 'jstor.org', 'academia.edu']):
            return 2  # Scientific Paper
        if any(domain in url_lower for domain in ['bbc.com', 'cnn.com', 'nytimes.com', 'folha.uol.com.br', 'g1.globo.com']):
            return 3  # News Article
        if any(domain in url_lower for domain in ['twitter.com', 'facebook.com', 'instagram.com', 'linkedin.com']):
            return 5  # Social Media
        if '.edu' in url_lower:
            return 6  # Academic Website
        if '.gov' in url_lower:
            return 7  # Government Website
        return 9  # Unknown type

    df['reference_type_id'] = df.apply(lambda x: classify_reference(x['reference_url'], x['reference_title']), axis=1)

    # --- Create Master Episode Table ---
    logger.info("Creating master episode table...")
    naruhodo_episodes = df[df['episode_number'].notna()][
        ['episode_number', 'episode_title', 'episode_url']
    ].drop_duplicates().sort_values('episode_number')

    # --- Create Episode-to-Episode References Table ---
    logger.info("Creating episode-to-episode references table with titles...")
    episode_refs = df[df['reference_type_id'] == 8].copy()
    naruhodo_episodes_references = episode_refs[['episode_number', 'referenced_episode_number']].dropna()
    naruhodo_episodes_references['episode_number'] = naruhodo_episodes_references['episode_number'].astype(int)
    naruhodo_episodes_references['referenced_episode_number'] = naruhodo_episodes_references['referenced_episode_number'].astype(int)

    # Map episode numbers to titles
    ep_num_to_title = naruhodo_episodes.set_index('episode_number')['episode_title'].to_dict()
    naruhodo_episodes_references['source_episode_title'] = naruhodo_episodes_references['episode_number'].map(ep_num_to_title)
    naruhodo_episodes_references['referenced_episode_title'] = naruhodo_episodes_references['referenced_episode_number'].map(ep_num_to_title)
    naruhodo_episodes_references = naruhodo_episodes_references.rename(columns={'episode_number': 'source_episode_number'})

    # Reorder columns
    naruhodo_episodes_references = naruhodo_episodes_references[
        ['source_episode_number', 'source_episode_title', 'referenced_episode_number', 'referenced_episode_title']
    ]

    # --- Create External References Table ---
    logger.info("Creating external references table and removing any Naruhodo episodes...")
    non_episode_refs = df[df['reference_type_id'] != 8].copy()
    naruhodo_references = non_episode_refs[[
        'episode_number', 'episode_title',
        'reference_title', 'reference_url',
        'reference_type_id'
    ]].dropna(subset=['episode_number'])

    # Remove any rows where reference_url contains Naruhodo episode URL (extra safety)
    naruhodo_in_refs_url = naruhodo_references[naruhodo_references['reference_url'].str.contains('b9.com.br/shows/naruhodo', na=False)]
    if not naruhodo_in_refs_url.empty:
        logger.warning(f"Removing {len(naruhodo_in_refs_url)} Naruhodo episodes from external references by URL.")
        naruhodo_references = naruhodo_references[~naruhodo_references['reference_url'].str.contains('b9.com.br/shows/naruhodo', na=False)]
    else:
        logger.info("No Naruhodo episodes found in external references by URL.")

    # Remove any rows where reference_title contains "Naruhodo" (final safety net)
    naruhodo_in_refs_title = naruhodo_references[naruhodo_references['reference_title'].str.contains('Naruhodo', na=False, case=False)]
    if not naruhodo_in_refs_title.empty:
        logger.warning(f"Removing {len(naruhodo_in_refs_title)} Naruhodo episodes from external references by title.")
        naruhodo_references = naruhodo_references[~naruhodo_references['reference_title'].str.contains('Naruhodo', na=False, case=False)]
    else:
        logger.info("No Naruhodo episodes found in external references by title.")

    # --- Save Datasets ---
    naruhodo_episodes.to_csv(os.path.join(DATA_DIR, 'naruhodo_episodes.csv'), index=False)
    naruhodo_episodes_references.to_csv(os.path.join(DATA_DIR, 'naruhodo_episodes_references.csv'), index=False)
    naruhodo_references.to_csv(os.path.join(DATA_DIR, 'naruhodo_references.csv'), index=False)

    logger.info(f"Saved {len(naruhodo_episodes)} episodes, {len(naruhodo_episodes_references)} episode-to-episode references, and {len(naruhodo_references)} external references.")

    # --- Validation ---
    logger.info("Validating datasets...")
    # Check for any Naruhodo episodes in external references
    naruhodo_in_refs_url = naruhodo_references[naruhodo_references['reference_url'].str.contains('b9.com.br/shows/naruhodo', na=False)]
    naruhodo_in_refs_title = naruhodo_references[naruhodo_references['reference_title'].str.contains('Naruhodo', na=False, case=False)]
    if not naruhodo_in_refs_url.empty or not naruhodo_in_refs_title.empty:
        logger.warning(f"Still found {len(naruhodo_in_refs_url) + len(naruhodo_in_refs_title)} Naruhodo episodes in external references after cleaning!")
    else:
        logger.info("No Naruhodo episodes found in external references. Clean separation achieved.")

    # Check for missing episode references
    missing_episodes = set(naruhodo_episodes_references['referenced_episode_number']) - set(naruhodo_episodes['episode_number'])
    if missing_episodes:
        logger.warning(f"Referenced episodes not found in master table: {missing_episodes}")
    else:
        logger.info("All referenced episodes are present in the master table.")

    logger.info("Data cleaning and preparation complete.")

# Run the cleaning/preparation
clean_and_prepare_datasets()

2025-04-18 21:49:14,042 - INFO - Loading original dataset...
2025-04-18 21:49:14,165 - INFO - Initial dataset size: 5585 rows
2025-04-18 21:49:14,209 - INFO - Dataset size after removing 'Podcast das Minas': 5387 rows
2025-04-18 21:49:16,514 - INFO - Classifying reference types...
2025-04-18 21:49:16,644 - INFO - Creating master episode table...
2025-04-18 21:49:16,658 - INFO - Creating episode-to-episode references table with titles...
2025-04-18 21:49:16,669 - INFO - Creating external references table and removing any Naruhodo episodes...
2025-04-18 21:49:16,687 - INFO - No Naruhodo episodes found in external references by URL.
2025-04-18 21:49:17,053 - INFO - Saved 338 episodes, 428 episode-to-episode references, and 3954 external references.
2025-04-18 21:49:17,057 - INFO - Validating datasets...
2025-04-18 21:49:17,071 - INFO - No Naruhodo episodes found in external references. Clean separation achieved.
2025-04-18 21:49:17,077 - INFO - Data cleaning and preparation complete.


### Data Import to Neo4j

In [11]:
import os
import pandas as pd
from neo4j import GraphDatabase
from dotenv import load_dotenv
import logging

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Load environment variables
load_dotenv()
NEO4J_URI = os.getenv('NEO4J_URI')
NEO4J_USER = os.getenv('NEO4J_USER')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD')
DATA_DIR = 'datasets/'

def connect_to_neo4j(uri, user, password):
    return GraphDatabase.driver(uri, auth=(user, password))

def clear_database(session):
    logger.info("Clearing existing database...")
    # Drop constraints first
    constraints_query = "SHOW CONSTRAINTS"
    constraints = session.run(constraints_query).data()
    for constraint in constraints:
        constraint_name = constraint['name']
        session.run(f"DROP CONSTRAINT {constraint_name} IF EXISTS")
    # Delete all nodes and relationships
    delete_query = "MATCH (n) DETACH DELETE n"
    session.run(delete_query)
    logger.info("Database cleared successfully!")

def create_constraints(session):
    logger.info("Creating constraints...")
    constraints = [
        "CREATE CONSTRAINT episode_number IF NOT EXISTS FOR (e:Episode) REQUIRE e.episode_number IS UNIQUE",
        "CREATE CONSTRAINT reference_url IF NOT EXISTS FOR (r:Reference) REQUIRE r.url IS UNIQUE"
    ]
    for constraint in constraints:
        session.run(constraint)

def create_episodes(session, episodes_df):
    logger.info("Creating Episode nodes...")
    episodes_df['episode_number'] = episodes_df['episode_number'].astype(int)
    query = """
    UNWIND $episodes AS episode
    MERGE (e:Episode {episode_number: episode.episode_number})
    SET e.title = episode.episode_title,
        e.url = episode.episode_url
    """
    episodes_data = episodes_df.to_dict('records')
    session.run(query, episodes=episodes_data)

def create_references(session, references_df):
    logger.info("Creating Reference nodes (external only)...")
    # Assign unique placeholder to missing/empty URLs
    missing_mask = references_df['reference_url'].isna() | (references_df['reference_url'].astype(str).str.strip() == '')
    num_missing = missing_mask.sum()
    if num_missing > 0:
        logger.warning(f"Assigning unique 'unknown_reference_url_<index>' placeholders to {num_missing} references with missing/empty URLs.")
        references_df.loc[missing_mask, 'reference_url'] = [
            f"unknown_reference_url_{i}" for i in references_df[missing_mask].index
        ]
    references_df = references_df.drop_duplicates(subset=['reference_url'])
    query = """
    UNWIND $references AS ref
    MERGE (r:Reference {url: ref.reference_url})
    SET r.title = ref.reference_title,
        r.type_id = ref.reference_type_id
    """
    references_data = references_df.to_dict('records')
    session.run(query, references=references_data)

def create_episode_to_episode_relationships(session, episode_refs_df):
    logger.info("Creating direct Episode-to-Episode REFERENCES relationships...")
    episode_refs_df['source_episode_number'] = episode_refs_df['source_episode_number'].astype(int)
    episode_refs_df['referenced_episode_number'] = episode_refs_df['referenced_episode_number'].astype(int)
    query = """
    UNWIND $relationships AS rel
    MATCH (source:Episode {episode_number: rel.source_episode_number})
    MATCH (target:Episode {episode_number: rel.referenced_episode_number})
    MERGE (source)-[:REFERENCES]->(target)
    """
    relationships_data = episode_refs_df.to_dict('records')
    session.run(query, relationships=relationships_data)

def create_episode_to_reference_relationships(session, references_df):
    logger.info("Creating Episode-to-Reference REFERENCES relationships...")
    # Assign unique placeholder to missing/empty URLs (to match Reference nodes)
    missing_mask = references_df['reference_url'].isna() | (references_df['reference_url'].astype(str).str.strip() == '')
    if missing_mask.sum() > 0:
        references_df.loc[missing_mask, 'reference_url'] = [
            f"unknown_reference_url_{i}" for i in references_df[missing_mask].index
        ]
    references_df['episode_number'] = references_df['episode_number'].astype(int)
    query = """
    UNWIND $references AS ref
    MATCH (e:Episode {episode_number: ref.episode_number})
    MATCH (r:Reference {url: ref.reference_url})
    MERGE (e)-[:REFERENCES]->(r)
    """
    references_data = references_df.to_dict('records')
    session.run(query, references=references_data)

def run_validation_queries(session):
    logger.info("Running validation queries...")
    validations = [
        ("Total Episode nodes", "MATCH (e:Episode) RETURN count(e) as count"),
        ("Total Reference nodes", "MATCH (r:Reference) RETURN count(r) as count"),
        ("Total Episode-to-Episode REFERENCES", "MATCH (e1:Episode)-[:REFERENCES]->(e2:Episode) RETURN count(*) as count"),
        ("Total Episode-to-Reference REFERENCES", "MATCH (e:Episode)-[:REFERENCES]->(r:Reference) RETURN count(*) as count"),
        ("Sample Episode-to-Episode", "MATCH (e1:Episode)-[:REFERENCES]->(e2:Episode) RETURN e1.episode_number, e2.episode_number LIMIT 3"),
        ("Sample Episode-to-Reference", "MATCH (e:Episode)-[:REFERENCES]->(r:Reference) RETURN e.episode_number, r.title, r.url LIMIT 3")
    ]
    for description, query in validations:
        result = session.run(query).data()
        logger.info(f"{description}: {result}")

def import_data():
    try:
        # Load datasets
        logger.info("Loading datasets...")
        episodes_df = pd.read_csv(os.path.join(DATA_DIR, 'naruhodo_episodes.csv'))
        episode_refs_df = pd.read_csv(os.path.join(DATA_DIR, 'naruhodo_episodes_references.csv'))
        references_df = pd.read_csv(os.path.join(DATA_DIR, 'naruhodo_references.csv'))

        logger.info(f"Connecting to Neo4j at {NEO4J_URI}")
        driver = connect_to_neo4j(NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD)

        with driver.session() as session:
            confirmation = input("This will delete all existing data in the database. Proceed? (yes/no): ")
            if confirmation.lower() != 'yes':
                logger.info("Import cancelled.")
                return

            clear_database(session)
            create_constraints(session)
            create_episodes(session, episodes_df)
            create_references(session, references_df)
            create_episode_to_episode_relationships(session, episode_refs_df)
            create_episode_to_reference_relationships(session, references_df)
            run_validation_queries(session)
            logger.info("Import completed successfully!")

    except Exception as e:
        logger.error(f"An error occurred: {str(e)}")
    finally:
        if 'driver' in locals():
            driver.close()

if __name__ == "__main__":
    if not (NEO4J_URI and NEO4J_USER and NEO4J_PASSWORD):
        raise ValueError("Missing Neo4j connection details in .env file")
    import_data()

2025-04-18 22:12:42,899 - INFO - Loading datasets...
2025-04-18 22:12:43,064 - INFO - Connecting to Neo4j at bolt://localhost:7687
2025-04-18 22:12:45,351 - INFO - Clearing existing database...
2025-04-18 22:12:47,979 - INFO - Database cleared successfully!
2025-04-18 22:12:47,980 - INFO - Creating constraints...
2025-04-18 22:12:48,388 - INFO - Creating Episode nodes...
2025-04-18 22:12:48,566 - INFO - Creating Reference nodes (external only)...
2025-04-18 22:12:49,219 - INFO - Creating direct Episode-to-Episode REFERENCES relationships...
2025-04-18 22:12:49,641 - INFO - Creating Episode-to-Reference REFERENCES relationships...
2025-04-18 22:12:50,315 - INFO - Running validation queries...
2025-04-18 22:12:50,449 - INFO - Total Episode nodes: [{'count': 338}]
2025-04-18 22:12:50,486 - INFO - Total Reference nodes: [{'count': 3313}]
2025-04-18 22:12:50,549 - INFO - Total Episode-to-Episode REFERENCES: [{'count': 375}]
2025-04-18 22:12:50,597 - INFO - Total Episode-to-Reference REFEREN
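
The graph counts logged above are lower than the row counts saved during cleaning: 428 episode-to-episode rows versus 375 relationships, and 3954 external reference rows versus 3313 `Reference` nodes. That is consistent with `MERGE` and `drop_duplicates` collapsing repeated episode pairs and repeated URLs, although the notebook does not confirm that this is the whole explanation. A minimal sketch of how the expected counts could be estimated from the CSV rows before import, using made-up sample rows; the helper name `expected_graph_counts` is hypothetical:

```python
def expected_graph_counts(episode_pairs, reference_urls):
    """Estimate how many relationships and Reference nodes an import would create.

    Repeated (source, target) pairs collapse into one relationship, and repeated
    URLs collapse into one node. Each missing or empty URL is counted on its own,
    because the importer gives every such row a unique placeholder.
    """
    distinct_pairs = {(int(src), int(dst)) for src, dst in episode_pairs}
    present = [url.strip() for url in reference_urls if url and url.strip()]
    missing = len(reference_urls) - len(present)
    return len(distinct_pairs), len(set(present)) + missing


# Made-up sample rows, not taken from the real dataset
sample_pairs = [(10, 3), (10, 3), (12, 3), (12, 7)]
sample_urls = ["https://example.org/a", "https://example.org/a", "https://example.org/b", ""]

pair_count, node_count = expected_graph_counts(sample_pairs, sample_urls)
print(f"{len(sample_pairs)} pair rows -> {pair_count} relationships")
print(f"{len(sample_urls)} reference rows -> {node_count} Reference nodes")
```

Comparing these estimates with the validation query results would show whether deduplication alone accounts for the difference, or whether some rows failed to match an existing node.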

### First attempt to import transcripts to Neo4j
This run took too long, and the embedding step does not appear to have completed.

In [None]:
import os
import re
import logging
from neo4j import GraphDatabase
from dotenv import load_dotenv
from sentence_transformers import SentenceTransformer
import pandas as pd

# --- Setup ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

load_dotenv()
NEO4J_URI = os.getenv('NEO4J_URI')
NEO4J_USER = os.getenv('NEO4J_USER')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD')
DATA_DIR = 'datasets/'
TRANSCRIPTS_DIR = 'transcripts/'

# --- Transcript Parsing ---
def parse_transcript_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    # Extract metadata
    meta = {}
    meta_match = re.search(
        r'Title: (.*?)\nURL: (.*?)\nLanguage: (.*?)\nExtracted on: (.*?)\n', content
    )
    if meta_match:
        meta['title'] = meta_match.group(1)
        meta['url'] = meta_match.group(2)
        meta['language'] = meta_match.group(3)
        meta['extracted_on'] = meta_match.group(4)
        ep_num_match = re.search(r'#(\d+)', meta['title'])
        meta['episode_number'] = int(ep_num_match.group(1)) if ep_num_match else None
    else:
        raise ValueError(f"Could not parse transcript metadata in {file_path}.")
    # Extract segments
    segments = []
    for match in re.finditer(r'\[(\d{2}:\d{2})\] (.*?)(?=\n\[|$)', content, re.DOTALL):
        timestamp, text = match.groups()
        text = text.strip()
        if text and not text.lower().startswith('[música'):
            segments.append({
                'timestamp': timestamp,
                'content': text
            })
    logger.info(f"Parsed {len(segments)} segments from transcript {file_path}.")
    return meta, segments

# --- Embedding Model ---
def get_embedding_model():
    logger.info("Loading embedding model...")
    return SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# --- Neo4j Importer ---
class TranscriptImporter:
    def __init__(self, uri, user, password, embedding_model):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))
        self.embedding_model = embedding_model

    def close(self):
        self.driver.close()

    def create_vector_index(self, session, dim=384):
        try:
            logger.info("Creating vector index for TranscriptSegment nodes (if not exists)...")
            session.run(f"""
                CREATE VECTOR INDEX transcript_segment_embedding IF NOT EXISTS
                FOR (s:TranscriptSegment)
                ON (s.embedding)
                OPTIONS {{
                    indexConfig: {{
                        `vector.dimensions`: {dim},
                        `vector.similarity_function`: 'cosine'
                    }}
                }}
            """)
        except Exception as e:
            logger.warning(f"Could not create vector index: {e}")

    def import_transcript(self, meta, segments):
        with self.driver.session() as session:
            # Ensure Episode node exists
            session.run("""
                MERGE (e:Episode {episode_number: $episode_number})
                SET e.title = $title, e.url = $url
            """, {
                'episode_number': meta['episode_number'],
                'title': meta['title'],
                'url': meta['url']
            })
            # Import segments
            for idx, seg in enumerate(segments):
                seg_id = f"{meta['episode_number']}_{seg['timestamp']}"
                embedding = self.embedding_model.encode(seg['content']).tolist()
                session.run("""
                    MERGE (s:TranscriptSegment {id: $id})
                    SET s.episode_number = $episode_number,
                        s.timestamp = $timestamp,
                        s.content = $content,
                        s.embedding = $embedding
                    WITH s
                    MATCH (e:Episode {episode_number: $episode_number})
                    MERGE (e)-[:HAS_SEGMENT]->(s)
                """, {
                    'id': seg_id,
                    'episode_number': meta['episode_number'],
                    'timestamp': seg['timestamp'],
                    'content': seg['content'],
                    'embedding': embedding
                })
            logger.info(f"Imported {len(segments)} segments for episode {meta['episode_number']}.")
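            # (Optional) Create vector index. Skipped for transcripts with no segments,
            # because `embedding` is only defined inside the loop above.
            if segments:
                self.create_vector_index(session, dim=len(embedding))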

# --- Main Batch Import Logic ---
if __name__ == "__main__":
    # Load valid episode numbers from naruhodo_episodes.csv
    episodes_df = pd.read_csv(os.path.join(DATA_DIR, 'naruhodo_episodes.csv'))
    valid_episode_numbers = set(episodes_df['episode_number'].astype(int))
    logger.info(f"Loaded {len(valid_episode_numbers)} valid episode numbers.")

    # Prepare embedding model and importer
    model = get_embedding_model()
    importer = TranscriptImporter(NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD, model)

    # Process all transcript files
    transcript_files = [os.path.join(TRANSCRIPTS_DIR, f) for f in os.listdir(TRANSCRIPTS_DIR) if f.endswith('.txt')]
    imported, skipped = 0, 0

    try:
        for file_path in transcript_files:
            try:
                meta, segments = parse_transcript_file(file_path)
                ep_num = meta.get('episode_number')
                if ep_num in valid_episode_numbers:
                    importer.import_transcript(meta, segments)
                    imported += 1
                else:
                    logger.warning(f"Skipping transcript {file_path}: episode {ep_num} not in episode list.")
                    skipped += 1
            except Exception as e:
                logger.error(f"Error processing {file_path}: {e}")
                skipped += 1
        logger.info(f"Import complete. Imported: {imported}, Skipped: {skipped}")
    finally:
        importer.close()

2025-04-18 22:58:26,939 - INFO - Loaded 338 valid episode numbers.
2025-04-18 22:58:26,939 - INFO - Loading embedding model...
2025-04-18 22:58:26,952 - INFO - Use pytorch device_name: cpu
2025-04-18 22:58:26,955 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2
2025-04-18 22:58:32,396 - INFO - Parsed 1408 segments from transcript transcripts/Naruhodo _438 - O termo _macho alfa_ faz sentido_ _UNKh0Zd3h_k_pt.txt.
Batches: 100%|██████████| 1/1 [00:00<00:00,  2.14it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 29.06it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 25.47it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 18.12it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 36.57it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 19.68it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 35.07it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 23.79it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 26.39it/s]
Batc
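
The first attempt calls `encode` once and `session.run` once for every segment, which is a likely reason for the long runtime. A hedged sketch of grouping segment rows into fixed-size batches, so that each batch can be sent as a single `UNWIND` parameter; the helper name `batched_rows`, the sample row IDs, and the batch size of 500 are illustrative assumptions, not values from the notebook:

```python
from typing import Dict, Iterator, List


def batched_rows(rows: List[Dict], batch_size: int = 500) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `rows`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Made-up rows shaped roughly like the segment dictionaries used in this notebook
rows = [{"id": f"438_{i}", "content": f"segment {i}"} for i in range(1408)]
batch_sizes = [len(batch) for batch in batched_rows(rows)]
print(batch_sizes)  # [500, 500, 408]
```

Each batch would then be passed to one call such as `session.run(query, segments=batch)`, with a query that begins with `UNWIND $segments AS seg`.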

### Test with single episode

In [1]:
import os
import re
import logging
import time
from neo4j import GraphDatabase
from dotenv import load_dotenv
from sentence_transformers import SentenceTransformer
import pandas as pd

# --- Setup ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

load_dotenv()
NEO4J_URI = os.getenv('NEO4J_URI')
NEO4J_USER = os.getenv('NEO4J_USER')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD')
DATA_DIR = 'datasets/'
TRANSCRIPTS_DIR = 'transcripts/'

# --- Transcript Parsing ---
def parse_transcript_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    # Extract metadata
    meta = {}
    meta_match = re.search(
        r'Title: (.*?)\nURL: (.*?)\nLanguage: (.*?)\nExtracted on: (.*?)\n', content
    )
    if meta_match:
        meta['title'] = meta_match.group(1)
        meta['url'] = meta_match.group(2)
        meta['language'] = meta_match.group(3)
        meta['extracted_on'] = meta_match.group(4)
        ep_num_match = re.search(r'#(\d+)', meta['title'])
        meta['episode_number'] = int(ep_num_match.group(1)) if ep_num_match else None
    else:
        raise ValueError(f"Could not parse transcript metadata in {file_path}.")
    # Extract segments
    segments = []
    for match in re.finditer(r'\[(\d{2}:\d{2})\] (.*?)(?=\n\[|$)', content, re.DOTALL):
        timestamp, text = match.groups()
        text = text.strip()
        if text and not text.lower().startswith('[música'):
            segments.append({
                'timestamp': timestamp,
                'content': text
            })
    logger.info(f"Parsed {len(segments)} segments from transcript {file_path}.")
    return meta, segments

# --- Neo4j Batch Importer ---
class BatchTranscriptImporter:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def batch_import_segments(self, episode_number, segments, episode_title, episode_url):
        with self.driver.session() as session:
            start = time.time()
            # Ensure Episode node exists
            session.run("""
                MERGE (e:Episode {episode_number: $episode_number})
                SET e.title = $title, e.url = $url
            """, {
                'episode_number': episode_number,
                'title': episode_title,
                'url': episode_url
            })
            # Prepare segment data (no embeddings yet)
            segment_nodes = [
                {
                    'id': f"{episode_number}_{seg['timestamp']}",
                    'episode_number': episode_number,
                    'timestamp': seg['timestamp'],
                    'content': seg['content']
                }
                for seg in segments
            ]
            # Batch create TranscriptSegment nodes
            session.run("""
                UNWIND $segments AS seg
                MERGE (s:TranscriptSegment {id: seg.id})
                SET s.episode_number = seg.episode_number,
                    s.timestamp = seg.timestamp,
                    s.content = seg.content
            """, {'segments': segment_nodes})
            # Batch create HAS_SEGMENT relationships
            session.run("""
                UNWIND $segments AS seg
                MATCH (e:Episode {episode_number: seg.episode_number})
                MATCH (s:TranscriptSegment {id: seg.id})
                MERGE (e)-[:HAS_SEGMENT]->(s)
            """, {'segments': segment_nodes})
            end = time.time()
            logger.info(f"Batched import of {len(segments)} segments for episode {episode_number} in {end - start:.2f} seconds.")
            print(f"Batched import of {len(segments)} segments for episode {episode_number} in {end - start:.2f} seconds.")

    def batch_update_embeddings(self, episode_number, segments, embedding_model):
        with self.driver.session() as session:
            start = time.time()
            # Prepare data for batch update
            segment_updates = []
            for seg in segments:
                seg_id = f"{episode_number}_{seg['timestamp']}"
                embedding = embedding_model.encode(seg['content']).tolist()
                segment_updates.append({'id': seg_id, 'embedding': embedding})
            # Batch update embeddings
            session.run("""
                UNWIND $updates AS upd
                MATCH (s:TranscriptSegment {id: upd.id})
                SET s.embedding = upd.embedding
            """, {'updates': segment_updates})
            end = time.time()
            logger.info(f"Batched embedding update for {len(segments)} segments of episode {episode_number} in {end - start:.2f} seconds.")
            print(f"Batched embedding update for {len(segments)} segments of episode {episode_number} in {end - start:.2f} seconds.")

# --- Main Test for Episode 420 ---
if __name__ == "__main__":
    # Find the transcript file for episode 420
    transcript_file = None
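    # Match the episode number directly after "Naruhodo _" so that a "420"
    # elsewhere in the file name (for example in the video ID) is not picked up
    for fname in os.listdir(TRANSCRIPTS_DIR):
        if fname.endswith('.txt') and re.match(r'Naruhodo _420\D', fname):
            transcript_file = os.path.join(TRANSCRIPTS_DIR, fname)
            break
    if not transcript_file:
        logger.error("Transcript file for episode 420 not found.")
        # SystemExit works in scripts and notebooks alike; bare exit() is meant for interactive use
        raise SystemExit(1)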

    # Parse transcript
    t0 = time.time()
    meta, segments = parse_transcript_file(transcript_file)
    t1 = time.time()
    logger.info(f"Parsing time: {t1 - t0:.2f} seconds.")
    print(f"Parsing time: {t1 - t0:.2f} seconds.")

    # Batch import segments (no embeddings)
    importer = BatchTranscriptImporter(NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD)
    importer.batch_import_segments(meta['episode_number'], segments, meta['title'], meta['url'])

    # Batch embedding update (separate step)
    t2 = time.time()
    model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
    importer.batch_update_embeddings(meta['episode_number'], segments, model)
    t3 = time.time()
    logger.info(f"Total embedding generation and update time: {t3 - t2:.2f} seconds.")
    print(f"Total embedding generation and update time: {t3 - t2:.2f} seconds.")

    importer.close()

2025-04-19 07:31:57,176 - INFO - Parsed 1100 segments from transcript transcripts/Naruhodo _420 - Maconha faz mal_ - Parte 2 de 2_F7wVcGvpoGA_pt.txt.
2025-04-19 07:31:57,176 - INFO - Parsing time: 0.06 seconds.


Parsing time: 0.06 seconds.


2025-04-19 07:32:30,824 - INFO - Batched import of 1100 segments for episode 420 in 33.65 seconds.
2025-04-19 07:32:30,866 - INFO - Use pytorch device_name: cpu
2025-04-19 07:32:30,868 - INFO - Load pretrained SentenceTransformer: sentence-transformers/all-MiniLM-L6-v2


Batched import of 1100 segments for episode 420 in 33.65 seconds.


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.06s/it]
Batches: 100%|██████████| 1/1 [00:00<00:00,  4.00it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 11.71it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 29.41it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 37.14it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 27.78it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 34.64it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 34.40it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 17.82it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 22.39it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 26.48it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 38.65it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 35.84it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 32.00it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 29.03it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 38.76it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 46.30it/s]
Batches: 100%|██████████| 1/1 [00:00<00:00, 38.64it/s]
Batches: 1

Batched embedding update for 1100 segments of episode 420 in 60.05 seconds.
Total embedding generation and update time: 65.33 seconds.
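
Writing 1100 segments took 33.65 seconds even in batched form. One possible contributor, not verified here, is that `MERGE (s:TranscriptSegment {id: ...})` has no index to look up `id`, since the import cell defines constraints only for `Episode` and `Reference`. A minimal sketch, assuming a uniqueness constraint on `TranscriptSegment.id` suits this data and that the server accepts the same constraint syntax used earlier in the notebook; the constraint name `transcript_segment_id`, the helper `ensure_segment_constraint`, and the stand-in session class are hypothetical:

```python
SEGMENT_ID_CONSTRAINT = (
    "CREATE CONSTRAINT transcript_segment_id IF NOT EXISTS "
    "FOR (s:TranscriptSegment) REQUIRE s.id IS UNIQUE"
)


def ensure_segment_constraint(session) -> None:
    """Run the constraint statement on an open session before importing segments."""
    session.run(SEGMENT_ID_CONSTRAINT)


class RecordingSession:
    """Stand-in for a neo4j session so the sketch runs without a database."""

    def __init__(self):
        self.statements = []

    def run(self, statement, **params):
        # Only record the statement; nothing is sent anywhere
        self.statements.append(statement)


fake_session = RecordingSession()
ensure_segment_constraint(fake_session)
print(fake_session.statements[0])
```

With a real driver, the same function would be called once inside `with driver.session() as session:` before the first segment batch is written. Whether it shortens the import would need to be measured.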


### Test with chunks of 300 tokens with overlap of 100
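
With a chunk size of 300 tokens and an overlap of 100, each chunk starts 200 tokens after the previous one. A small sketch of the window arithmetic on token positions alone, mirroring the loop in `chunk_transcript_tokenwise` but without a tokenizer; the function name `window_spans` is hypothetical:

```python
def window_spans(num_tokens, chunk_size=300, overlap=100):
    """Return (start, end) token index pairs for overlapping chunks."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - overlap  # distance between consecutive chunk starts
    spans = []
    start = 0
    while start < num_tokens:
        end = min(start + chunk_size, num_tokens)
        spans.append((start, end))
        if end == num_tokens:
            break  # the last chunk reached the end of the text
        start += stride
    return spans


print(window_spans(700))  # [(0, 300), (200, 500), (400, 700)]
```

Only the final chunk can be shorter than `chunk_size`. A transcript shorter than one chunk produces a single span covering all of it.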


In [8]:
import os
import re
import time
from dotenv import load_dotenv
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer
from neo4j import GraphDatabase

# --- Setup ---
load_dotenv()
NEO4J_URI = os.getenv('NEO4J_URI')
NEO4J_USER = os.getenv('NEO4J_USER')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD')
TRANSCRIPTS_DIR = 'transcripts'

MODEL_NAME = 'sentence-transformers/all-MiniLM-L6-v2'
embedding_model = SentenceTransformer(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

CHUNK_SIZE = 300
CHUNK_OVERLAP = 100

# --- Episodes to Skip ---
SKIP_EPISODES = {129, 130, 131, 7, 18, 23, 26, 28, 37, 39, 41, 44, 48, 49, 50, 54, 57, 67, 70, 73, 76, 84, 85, 90, 92, 97, 99, 100, 104, 112}

# --- Transcript Loader ---
def load_transcript(episode_number):
    # Use regex to match the episode number as a complete token after the underscore
    pattern = re.compile(rf"^Naruhodo _{episode_number}\D")
    transcript_files = [f for f in os.listdir(TRANSCRIPTS_DIR) if pattern.match(f) and f.endswith('.txt')]
    if len(transcript_files) != 1:
        raise FileNotFoundError(f"Expected exactly one transcript file for episode {episode_number}, found: {transcript_files}")
    transcript_path = os.path.join(TRANSCRIPTS_DIR, transcript_files[0])
    print(f"Loading transcript for episode {episode_number}: {transcript_path}")  # Debug print
    segments = []
    timestamp_re = re.compile(r"\[(\d{2}):(\d{2})\] ?(.*)")
    with open(transcript_path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            match = timestamp_re.match(line)
            if match:
                minutes = int(match.group(1))
                seconds = int(match.group(2))
                text = match.group(3).strip()
                total_seconds = minutes * 60 + seconds
                segments.append((total_seconds, text))
    return segments

# --- Token-based Chunker ---
def chunk_transcript_tokenwise(segments, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP, tokenizer=tokenizer):
    all_text = " ".join([text for _, text in segments])
    tokens = tokenizer(all_text, return_offsets_mapping=True, add_special_tokens=False)
    input_ids = tokens['input_ids']
    offsets = tokens['offset_mapping']
    chunks = []
    chunk_spans = []
    start = 0
    while start < len(input_ids):
        end = min(start + chunk_size, len(input_ids))
        chunk_ids = input_ids[start:end]
        chunk_start_char = offsets[start][0]
        # end never exceeds len(input_ids), so end - 1 is always a valid offset index
        chunk_end_char = offsets[end - 1][1]
        chunk_text = all_text[chunk_start_char:chunk_end_char].strip()
        chunks.append(chunk_text)
        chunk_spans.append((start, end))
        if end == len(input_ids):
            break
        start += chunk_size - overlap
    print(f"Chunked into {len(chunks)} chunks (size={chunk_size}, overlap={overlap})")
    print("First 3 chunk lengths (tokens):", [e-s for s, e in chunk_spans[:3]])
    return chunks

# --- Embedding ---
def generate_embeddings(text_chunks, batch_size=32):
    return embedding_model.encode(text_chunks, batch_size=batch_size, show_progress_bar=True)

# --- Neo4j Driver ---
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))

def delete_all_transcript_segments():
    with driver.session() as session:
        session.run("""
            MATCH (s:TranscriptSegment)
            DETACH DELETE s
        """)
        session.run("""
            MATCH ()-[r:HAS_SEGMENT]->()
            DELETE r
        """)

def import_to_neo4j(episode_number, text_chunks, embeddings):
    with driver.session() as session:
        session.run(
            """
            MERGE (e:Episode {episode_number: $episode_number})
            """,
            {"episode_number": episode_number}
        )
        segment_nodes = [
            {
                'id': f"{episode_number}_{i}",
                'episode_number': episode_number,
                'chunk_index': i,
                'text': chunk,
                'embedding': embedding.tolist() if hasattr(embedding, 'tolist') else list(embedding)
            }
            for i, (chunk, embedding) in enumerate(zip(text_chunks, embeddings))
        ]
        session.run(
            """
            UNWIND $segments AS seg
            MERGE (s:TranscriptSegment {id: seg.id})
            SET s.episode_number = seg.episode_number,
                s.chunk_index = seg.chunk_index,
                s.text = seg.text,
                s.embedding = seg.embedding
            """,
            {'segments': segment_nodes}
        )
        session.run(
            """
            UNWIND $segments AS seg
            MATCH (e:Episode {episode_number: seg.episode_number})
            MATCH (s:TranscriptSegment {id: seg.id})
            MERGE (e)-[:HAS_SEGMENT]->(s)
            """,
            {'segments': segment_nodes}
        )

# --- Main Loop for Jupyter ---
# 1. Delete all transcript segments before import
print("Deleting all TranscriptSegment nodes and HAS_SEGMENT relationships...")
delete_all_transcript_segments()
print("Done.")

# 2. Import all transcripts except those in SKIP_EPISODES
all_results = []
episode_numbers = []
for fname in os.listdir(TRANSCRIPTS_DIR):
    match = re.search(r'Naruhodo _(\d+)', fname)
    if match:
        ep_num = int(match.group(1))
        if ep_num in SKIP_EPISODES:
            print(f"Skipping episode {ep_num} (in skip list).")
            continue
        if fname.endswith('.txt'):
            episode_numbers.append(ep_num)
episode_numbers = sorted(set(episode_numbers))

for episode_number in episode_numbers:
    timings = {}
    print(f"\nProcessing Episode {episode_number}...")
    t0 = time.time()
    segments = load_transcript(episode_number)
    t1 = time.time()
    timings['load'] = t1 - t0

    chunks = chunk_transcript_tokenwise(segments, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP, tokenizer=tokenizer)
    t2 = time.time()
    timings['chunk'] = t2 - t1

    embeddings = generate_embeddings(chunks, batch_size=32)
    t3 = time.time()
    timings['embed'] = t3 - t2

    import_to_neo4j(episode_number, chunks, embeddings)
    t4 = time.time()
    timings['import'] = t4 - t3

    timings['total'] = t4 - t0
    all_results.append((episode_number, timings))

    print(f"  Load:   {timings['load']:.2f}s")
    print(f"  Chunk:  {timings['chunk']:.2f}s")
    print(f"  Embed:  {timings['embed']:.2f}s")
    print(f"  Import: {timings['import']:.2f}s")
    print(f"  Total:  {timings['total']:.2f}s")

print("\n=== All Episode Timing Summary ===")
for ep, timings in all_results:
    print(f"Episode {ep}: {timings['total']:.2f}s (Load: {timings['load']:.2f}s, Chunk: {timings['chunk']:.2f}s, Embed: {timings['embed']:.2f}s, Import: {timings['import']:.2f}s)")

driver.close()

Deleting all TranscriptSegment nodes and HAS_SEGMENT relationships...


Token indices sequence length is longer than the specified maximum sequence length for this model (2153 > 512). Running this sequence through the model will result in indexing errors


Done.
Skipping episode 131 (in skip list).
Skipping episode 129 (in skip list).
Skipping episode 112 (in skip list).
Skipping episode 130 (in skip list).
Skipping episode 104 (in skip list).
Skipping episode 100 (in skip list).
Skipping episode 99 (in skip list).
Skipping episode 97 (in skip list).
Skipping episode 92 (in skip list).
Skipping episode 90 (in skip list).
Skipping episode 85 (in skip list).
Skipping episode 84 (in skip list).
Skipping episode 76 (in skip list).
Skipping episode 73 (in skip list).
Skipping episode 70 (in skip list).
Skipping episode 67 (in skip list).
Skipping episode 57 (in skip list).
Skipping episode 54 (in skip list).
Skipping episode 50 (in skip list).
Skipping episode 49 (in skip list).
Skipping episode 48 (in skip list).
Skipping episode 44 (in skip list).
Skipping episode 41 (in skip list).
Skipping episode 39 (in skip list).
Skipping episode 37 (in skip list).
Skipping episode 28 (in skip list).
Skipping episode 26 (in skip list).
Skipping episode

Batches: 100%|██████████| 1/1 [00:00<00:00,  1.16it/s]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  0.88s
  Import: 0.40s
  Total:  1.37s

Processing Episode 2...
Loading transcript for episode 2: transcripts\Naruhodo _2 - Por que o CD tem o tamanho que tem__tn-qFPWzVbg_pt.txt
Chunked into 9 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.31it/s]


  Load:   0.08s
  Chunk:  0.01s
  Embed:  0.77s
  Import: 0.07s
  Total:  0.93s

Processing Episode 3...
Loading transcript for episode 3: transcripts\Naruhodo _3 - A soma de números positivos pode dar_SIXVO5aEaVs_pt.txt
Chunked into 8 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  2.41it/s]


  Load:   0.07s
  Chunk:  0.00s
  Embed:  0.42s
  Import: 0.05s
  Total:  0.54s

Processing Episode 4...
Loading transcript for episode 4: transcripts\Naruhodo _4 - A maior parte dos casos de câncer é _Uhqhq6POjIU_pt.txt
Chunked into 22 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.08s/it]


  Load:   0.04s
  Chunk:  0.02s
  Embed:  1.08s
  Import: 0.22s
  Total:  1.36s

Processing Episode 5...
Loading transcript for episode 5: transcripts\Naruhodo _5 - É possivel recuperar uma noite mal d_E-JN2qagigY_pt.txt
Chunked into 17 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.08it/s]


  Load:   0.08s
  Chunk:  0.01s
  Embed:  0.93s
  Import: 0.13s
  Total:  1.14s

Processing Episode 6...
Loading transcript for episode 6: transcripts\Naruhodo _6 - Mulheres que vivem juntas ovulam ao _sw3W6ugoZrE_pt.txt
Chunked into 16 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.25it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.81s
  Import: 0.08s
  Total:  0.96s

Processing Episode 8...
Loading transcript for episode 8: transcripts\Naruhodo _8 - Comentários dos ouvintes_ E a respos_yB__znaIXrM_pt.txt
Chunked into 28 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.46s/it]


  Load:   0.04s
  Chunk:  0.02s
  Embed:  1.46s
  Import: 0.17s
  Total:  1.70s

Processing Episode 9...
Loading transcript for episode 9: transcripts\Naruhodo _9 - Como você resolveria o enigma dos su_yVXmojJaGPM_pt.txt
Chunked into 3 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 283]


Batches: 100%|██████████| 1/1 [00:00<00:00,  5.07it/s]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  0.22s
  Import: 0.07s
  Total:  0.34s
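
The chunk lengths reported for Episode 9 (`[300, 300, 283]` with `size=300, overlap=100`) are consistent with a sliding window that advances 200 tokens at a time and stops once a window reaches the end of the transcript. The notebook's own chunking function is not shown in this part of the output, so the sketch below is only one scheme that reproduces those numbers: `chunk_tokens` is a hypothetical name, and the 683-token input length is inferred from the reported lengths rather than taken from the source.

```python
def chunk_tokens(tokens, size=300, overlap=100):
    """Split a token list into windows of `size` tokens overlapping by `overlap`.

    Hypothetical sketch; the notebook's actual implementation may differ.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(tokens):
        end = min(start + size, len(tokens))
        chunks.append(tokens[start:end])
        if end == len(tokens):
            break  # this window reached the end; skip any trailing fragment
        start += step
    return chunks


# Stand-in token ids; 683 is an inferred length, not a value from the source.
tokens = list(range(683))
chunks = chunk_tokens(tokens)
lengths = [len(c) for c in chunks]
print(f"Chunked into {len(chunks)} chunks, lengths: {lengths}")
```

Windows of 300 tokens also stay below the 512-token maximum named in the tokenizer warning near the top of this output.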

Processing Episode 10...
Loading transcript for episode 10: transcripts\Naruhodo _10 - Mudar de casa rejuvenesce a memória_cN-DQpSH7R8_pt.txt
Chunked into 15 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.35it/s]


  Load:   0.07s
  Chunk:  0.01s
  Embed:  0.75s
  Import: 0.11s
  Total:  0.94s

Processing Episode 11...
Loading transcript for episode 11: transcripts\Naruhodo _11 - Por que os japoneses lêem da direit_PRtp7MjjfUU_pt.txt
Chunked into 9 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  2.17it/s]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  0.47s
  Import: 0.06s
  Total:  0.59s

Processing Episode 12...
Loading transcript for episode 12: transcripts\Naruhodo _12 - Comer e entrar na água faz mal__DAGOcEJHvsw_pt.txt
Chunked into 9 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  2.31it/s]


  Load:   0.04s
  Chunk:  0.00s
  Embed:  0.44s
  Import: 0.07s
  Total:  0.55s

Processing Episode 13...
Loading transcript for episode 13: transcripts\Naruhodo _13 - Como você resolveria este enigma po_w5s4dz3HdZY_pt.txt
Chunked into 25 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.16s
  Import: 0.13s
  Total:  1.35s

Processing Episode 14...
Loading transcript for episode 14: transcripts\Naruhodo _14 - Sonhamos todas as noites_ até quand_SAtAJkvtcxk_pt.txt
Chunked into 14 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.32it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.76s
  Import: 0.06s
  Total:  0.89s

Processing Episode 15...
Loading transcript for episode 15: transcripts\Naruhodo _15 - Genes são ativados num corpo depois_0BiAw5zhnUM_pt.txt
Chunked into 15 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.40it/s]


  Load:   0.06s
  Chunk:  1.49s
  Embed:  0.73s
  Import: 0.07s
  Total:  2.36s

Processing Episode 16...
Loading transcript for episode 16: transcripts\Naruhodo _16 - O que é e qual o impacto do Movimen_HgXmutXn3E0_pt.txt
Chunked into 18 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.15it/s]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  0.87s
  Import: 0.10s
  Total:  1.03s

Processing Episode 17...
Loading transcript for episode 17: transcripts\Naruhodo _17 - A matemática pode ajudar a ganhar n_vNVPQWfkVnM_pt.txt
Chunked into 17 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.34it/s]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  0.75s
  Import: 0.10s
  Total:  0.91s

Processing Episode 19...
Loading transcript for episode 19: transcripts\Naruhodo _19 - Como você resolveria o mistério da _wxF4lRCf4FA_pt.txt
Chunked into 29 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.51s/it]


  Load:   0.03s
  Chunk:  0.02s
  Embed:  1.52s
  Import: 0.20s
  Total:  1.78s

Processing Episode 20...
Loading transcript for episode 20: transcripts\Naruhodo _20 - Homeopatia funciona_ segundo a ciên_5RKZbHuFXko_pt.txt
Chunked into 38 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.00s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.01s
  Import: 0.30s
  Total:  2.38s

Processing Episode 21...
Loading transcript for episode 21: transcripts\Naruhodo _21 - Comer placenta faz bem__IuhipdmyA_g_pt.txt
Chunked into 13 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.67it/s]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  0.60s
  Import: 0.06s
  Total:  0.71s

Processing Episode 22...
Loading transcript for episode 22: transcripts\Naruhodo _22 - A evolução pode vir a _despuxar_ os_PwcNzXXC28Y_pt.txt
Chunked into 12 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  0.53s
  Import: 0.07s
  Total:  0.64s

Processing Episode 24...
Loading transcript for episode 24: transcripts\Naruhodo _24 - Pessoas preguiçosas são mais inteli_PHn45UgHUEo_pt.txt
Chunked into 18 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.25it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.80s
  Import: 0.10s
  Total:  0.97s

Processing Episode 25...
Loading transcript for episode 25: transcripts\Naruhodo _25 - Como você resolveria o caso dos 23 _Dza7RnGx36U_pt.txt
Chunked into 21 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.13s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  1.13s
  Import: 0.14s
  Total:  1.37s

Processing Episode 27...
Loading transcript for episode 27: transcripts\Naruhodo _27 - Bebedores de gin tônica são mais pr_l6kkme8WXUA_pt.txt
Chunked into 15 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.46it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.69s
  Import: 0.07s
  Total:  0.83s

Processing Episode 29...
Loading transcript for episode 29: transcripts\Naruhodo _29 - O que é e como acontece o déjà vu__MsgpP0CWrZs_pt.txt
Chunked into 29 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.39s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  1.39s
  Import: 0.17s
  Total:  1.63s

Processing Episode 30...
Loading transcript for episode 30: transcripts\Naruhodo _30 - Por que homens ficam carecas e mulh_Zxt96XQGGnc_pt.txt
Chunked into 20 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.07it/s]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  0.94s
  Import: 0.12s
  Total:  1.13s

Processing Episode 31...
Loading transcript for episode 31: transcripts\Naruhodo _31 - Misturar bebidas alcoólicas piora a_tA3OJmEjXMY_pt.txt
Chunked into 12 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.70it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.59s
  Import: 0.06s
  Total:  0.72s

Processing Episode 32...
Loading transcript for episode 32: transcripts\Naruhodo _32 - Por que precisamos falar urgentemen_eWZnJbw_33I_pt.txt
Chunked into 33 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.30it/s]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  1.54s
  Import: 0.16s
  Total:  1.77s

Processing Episode 33...
Loading transcript for episode 33: transcripts\Naruhodo _33 - É tinta ou óleo nas pernas que roda_CJKkC33grPM_pt.txt
Chunked into 28 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.37s/it]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  1.37s
  Import: 0.13s
  Total:  1.56s

Processing Episode 34...
Loading transcript for episode 34: transcripts\Naruhodo _34 - Qual a origem da Black Friday__TRsuubVZzyQ_pt.txt
Chunked into 15 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.47it/s]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  0.68s
  Import: 0.09s
  Total:  0.83s

Processing Episode 35...
Loading transcript for episode 35: transcripts\Naruhodo _35 - Pessoas absorvem energia umas das o_LBx_pVwgcCU_pt.txt
Chunked into 23 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.01s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  1.01s
  Import: 0.14s
  Total:  1.21s

Processing Episode 36...
Loading transcript for episode 36: transcripts\Naruhodo _36 - Mal de Alzheimer e envelhecimento d_VLZ6EG-3Zsg_pt.txt
Chunked into 23 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.02it/s]


  Load:   0.03s
  Chunk:  0.02s
  Embed:  0.98s
  Import: 0.12s
  Total:  1.15s

Processing Episode 38...
Loading transcript for episode 38: transcripts\Naruhodo _38 - Beber água em copo pequeno mata a s_mb1cg3N_rU8_pt.txt
Chunked into 18 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.25it/s]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  0.80s
  Import: 0.11s
  Total:  0.96s

Processing Episode 40...
Loading transcript for episode 40: transcripts\Naruhodo _40 - Como você resolveria o enigma do re_hyxfPH3Ky98_pt.txt
Chunked into 28 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.45s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  1.46s
  Import: 0.17s
  Total:  1.71s

Processing Episode 42...
Loading transcript for episode 42: transcripts\Naruhodo _42 - Cirandas infantis giram sempre na m__Defw5d94gI_pt.txt
Chunked into 26 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.36s/it]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  1.38s
  Import: 0.17s
  Total:  1.60s

Processing Episode 43...
Loading transcript for episode 43: transcripts\Naruhodo _43 - Especial_ Rapidinhas do Naruhodo_ -_Plwa2xbQ4Vk_pt.txt
Chunked into 12 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.90it/s]


  Load:   0.03s
  Chunk:  0.02s
  Embed:  0.53s
  Import: 0.07s
  Total:  0.64s

Processing Episode 45...
Loading transcript for episode 45: transcripts\Naruhodo _45 - É possível um motor produzir movime_dE47TRviCVM_pt.txt
Chunked into 38 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.17it/s]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.73s
  Import: 0.22s
  Total:  2.02s

Processing Episode 46...
Loading transcript for episode 46: transcripts\Naruhodo _46 - Você consegue resolver o caso da mo_yooHnPTdQR8_pt.txt
Chunked into 29 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.88s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  1.88s
  Import: 0.29s
  Total:  2.25s

Processing Episode 51...
Loading transcript for episode 51: transcripts\Naruhodo _51 - Astrologia_ horóscopo e mapa astral_aTpTJyZ4Fm8_pt.txt
Chunked into 43 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.24s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  2.50s
  Import: 0.25s
  Total:  2.85s

Processing Episode 52...
Loading transcript for episode 52: transcripts\Naruhodo _52 - No bar_ fazer xixi uma primeira vez_WMUrKMHJovc_pt.txt
Chunked into 17 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.25it/s]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  0.81s
  Import: 0.11s
  Total:  0.99s

Processing Episode 53...
Loading transcript for episode 53: transcripts\Naruhodo _53 - Desafio Naruhodo_ O que aconteceu n_bZq4ZknxXqY_pt.txt
Chunked into 18 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:00<00:00,  1.09it/s]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  0.93s
  Import: 0.10s
  Total:  1.10s

Processing Episode 55...
Loading transcript for episode 55: transcripts\Naruhodo _55 - Batata frita é mais benéfica que ba_3uH_S_ECrGg_pt.txt
Chunked into 26 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.43s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  1.44s
  Import: 0.18s
  Total:  1.72s

Processing Episode 56...
Loading transcript for episode 56: transcripts\Naruhodo _56 - Especial_ Rapidinhas do Naruhodo_ -_KBvdwK-LwCE_pt.txt
Chunked into 31 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.78s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  1.78s
  Import: 0.19s
  Total:  2.05s

Processing Episode 58...
Loading transcript for episode 58: transcripts\Naruhodo _58 - Por que os horários dos remédios de_v83mD8poBwk_pt.txt
Chunked into 33 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.18it/s]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.69s
  Import: 0.20s
  Total:  1.95s

Processing Episode 59...
Loading transcript for episode 59: transcripts\Naruhodo _59 - Roupa com proteção solar funciona__hsiiLsFf2To_pt.txt
Chunked into 20 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.04s/it]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  1.05s
  Import: 0.17s
  Total:  1.28s

Processing Episode 60...
Loading transcript for episode 60: transcripts\Naruhodo _60 - Desafio Naruhodo_ Você sabe resolve_OJWR9nuIwRA_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.41s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  2.83s
  Import: 0.27s
  Total:  3.16s

Processing Episode 61...
Loading transcript for episode 61: transcripts\Naruhodo _61 - Pessoas ricas prestam menos atenção_b9XqhOg-19E_pt.txt
Chunked into 43 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.10s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.21s
  Import: 0.23s
  Total:  2.51s

Processing Episode 62...
Loading transcript for episode 62: transcripts\Naruhodo _62 - Existem doenças psicossomáticas__etuFYdCAKe4_pt.txt
Chunked into 43 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.30s
  Import: 0.34s
  Total:  2.72s

Processing Episode 63...
Loading transcript for episode 63: transcripts\Naruhodo _63 - A frequência cardíaca da mãe influe_nBj6D__765Q_pt.txt
Chunked into 26 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.58s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  1.58s
  Import: 0.29s
  Total:  1.94s

Processing Episode 64...
Loading transcript for episode 64: transcripts\Naruhodo _64 - Salário maior significa mais felici_0t4c3kR1WEw_pt.txt
Chunked into 25 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.47s/it]


  Load:   0.22s
  Chunk:  0.02s
  Embed:  1.47s
  Import: 0.14s
  Total:  1.84s

Processing Episode 65...
Loading transcript for episode 65: transcripts\Naruhodo _65 - Existe um microscópio que consegue __tjs-qUS7uI_pt.txt
Chunked into 25 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.35s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  1.35s
  Import: 0.18s
  Total:  1.60s

Processing Episode 66...
Loading transcript for episode 66: transcripts\Naruhodo _66 - Um ano de cachorro equivale a sete _I8ho2KVuCnM_pt.txt
Chunked into 26 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.55s/it]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  1.56s
  Import: 0.16s
  Total:  1.78s

Processing Episode 68...
Loading transcript for episode 68: transcripts\Naruhodo _68 - Arroz cozido pode trazer riscos à s_gpl5Eh0BFz4_pt.txt
Chunked into 33 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.07it/s]


  Load:   0.05s
  Chunk:  0.00s
  Embed:  1.89s
  Import: 0.24s
  Total:  2.19s

Processing Episode 69...
Loading transcript for episode 69: transcripts\Naruhodo _69 - Desafio Naruhodo_ Como resolver o p_DWQNFLdtW4I_pt.txt
Chunked into 50 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.47s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.96s
  Import: 0.39s
  Total:  3.44s

Processing Episode 71...
Loading transcript for episode 71: transcripts\Naruhodo _71 - Especial_ Rapidinhas do Naruhodo_ -_ygiJcBB3jaI_pt.txt
Chunked into 37 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.06s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  2.13s
  Import: 0.26s
  Total:  2.46s

Processing Episode 74...
Loading transcript for episode 74: transcripts\Naruhodo _74 - Por que algumas músicas nos arrepia_s9vsB2a8G2E_pt.txt
Chunked into 37 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.10s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  2.22s
  Import: 0.25s
  Total:  2.55s

Processing Episode 75...
Loading transcript for episode 75: transcripts\Naruhodo _75 - Cada pessoa tem um cheiro diferente_whOteWgc5Oo_pt.txt
Chunked into 44 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.23s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.47s
  Import: 0.30s
  Total:  2.85s

Processing Episode 77...
Loading transcript for episode 77: transcripts\Naruhodo _77 - Segurar espirro pode matar__5m5P73bhIHo_pt.txt
Chunked into 24 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.46s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.46s
  Import: 0.18s
  Total:  1.72s

Processing Episode 78...
Loading transcript for episode 78: transcripts\Naruhodo _78 - Como funciona a memória__k6534PctplQ_pt.txt
Chunked into 68 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.16s/it]


  Load:   0.04s
  Chunk:  0.04s
  Embed:  3.49s
  Import: 0.36s
  Total:  3.93s

Processing Episode 79...
Loading transcript for episode 79: transcripts\Naruhodo _79 - Especial_ Rapidinhas do Naruhodo_ -_fWkdPp4fPpU_pt.txt
Chunked into 36 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.01it/s]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.98s
  Import: 0.25s
  Total:  2.30s

Processing Episode 80...
Loading transcript for episode 80: transcripts\Naruhodo _80 - Desafio Naruhodo_ Qual a pergunta d_wjnun_vqUbg_pt.txt
Chunked into 32 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.73s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.73s
  Import: 0.22s
  Total:  2.01s

Processing Episode 81...
Loading transcript for episode 81: transcripts\Naruhodo _81 - Por que ficamos com cabelos brancos_tuQy_DwUTlw_pt.txt
Chunked into 41 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.05s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  2.10s
  Import: 0.27s
  Total:  2.45s

Processing Episode 82...
Loading transcript for episode 82: transcripts\Naruhodo _82 - A mu_sica do Ed Sheeran ajuda a ven_H31wcJ0b_bU_pt.txt
Chunked into 36 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.04s/it]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  2.08s
  Import: 0.29s
  Total:  2.44s

Processing Episode 83...
Loading transcript for episode 83: transcripts\Naruhodo _83 - O que são sonhos__rKvDGxCg7XE_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.08s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.17s
  Import: 0.60s
  Total:  4.87s

Processing Episode 86...
Loading transcript for episode 86: transcripts\Naruhodo _86 - Julgamos homens com barba de um jei_Mqn6iUZ4Qb0_pt.txt
Chunked into 50 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.55s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  3.11s
  Import: 0.31s
  Total:  3.54s

Processing Episode 87...
Loading transcript for episode 87: transcripts\Naruhodo _87 - Desafio Naruhodo_ Qual a resposta p_Z32KqgpwQyM_pt.txt
Chunked into 32 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.88s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  1.88s
  Import: 0.24s
  Total:  2.21s

Processing Episode 88...
Loading transcript for episode 88: transcripts\Naruhodo _88 - Estamos próximos de uma cura para a_pZ7jc-wwoLM_pt.txt
Chunked into 64 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.80s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  3.60s
  Import: 0.39s
  Total:  4.10s

Processing Episode 89...
Loading transcript for episode 89: transcripts\Naruhodo _89 - É possível mudar a forma do arroz c_5m_mgkgPG6g_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.26s/it]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  2.54s
  Import: 0.33s
  Total:  2.92s

Processing Episode 91...
Loading transcript for episode 91: transcripts\Naruhodo _91 - Especial_ Rapidinhas do Naruhodo_ -_XLENCmlju9o_pt.txt
Chunked into 42 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.29s
  Import: 0.27s
  Total:  2.63s

Processing Episode 93...
Loading transcript for episode 93: transcripts\Naruhodo _93 - Ejacular com frequência reduz o ris_aNSmiJLxyhQ_pt.txt
Chunked into 42 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.08s/it]


  Load:   0.10s
  Chunk:  0.02s
  Embed:  2.18s
  Import: 0.36s
  Total:  2.66s

Processing Episode 94...
Loading transcript for episode 94: transcripts\Naruhodo _94 - O que é o Teorema de Bayes_ _E o qu_BE5fpsfPerw_pt.txt
Chunked into 47 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.34s/it]


  Load:   0.08s
  Chunk:  0.03s
  Embed:  2.69s
  Import: 0.35s
  Total:  3.16s

Processing Episode 95...
Loading transcript for episode 95: transcripts\Naruhodo _95 - Pessoas bonitas são privilegiadas__jjJTy3vlXOA_pt.txt
Chunked into 57 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.58s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.17s
  Import: 0.35s
  Total:  3.58s

Processing Episode 96...
Loading transcript for episode 96: transcripts\Naruhodo _96 - Desafio Naruhodo_ Quem está mentind_DKn1g4HJd2s_pt.txt
Chunked into 38 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.01it/s]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  1.99s
  Import: 0.30s
  Total:  2.37s

Processing Episode 98...
Loading transcript for episode 98: transcripts\Naruhodo _98 - Por que precisamos falar sobre suic_Yow-FP77YHY_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.85s
  Import: 0.46s
  Total:  4.39s

Processing Episode 101...
Loading transcript for episode 101: transcripts\Naruhodo _101 - Plantas são inteligentes__K3lvPcx1w6I_pt.txt
Chunked into 35 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.02s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.03s
  Import: 0.29s
  Total:  2.40s

Processing Episode 102...
Loading transcript for episode 102: transcripts\Naruhodo _102 - O que acontece com uma formiga que_MzPnhGBkPcY_pt.txt
Chunked into 44 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.35s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  2.71s
  Import: 0.25s
  Total:  3.06s

Processing Episode 103...
Loading transcript for episode 103: transcripts\Naruhodo _103 - Testes de personalidade funcionam__uZl_y6N6hHA_pt.txt
Chunked into 58 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.68s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.37s
  Import: 0.40s
  Total:  3.85s

Processing Episode 105...
Loading transcript for episode 105: transcripts\Naruhodo _105 - Por que criamos bolhas no corpo__71lZaPnNQVo_pt.txt
Chunked into 44 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.24s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.52s
  Import: 0.32s
  Total:  2.90s

Processing Episode 106...
Loading transcript for episode 106: transcripts\Naruhodo _106 - Hipnose funciona_ - Parte 1 de 2_XIHUilfmAzQ_pt.txt
Chunked into 41 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.06s/it]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  2.11s
  Import: 0.31s
  Total:  2.47s

Processing Episode 107...
Loading transcript for episode 107: transcripts\Naruhodo _107 - Hipnose funciona_ - Parte 2 de 2_O42Yo2zt5vs_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.27s/it]


  Load:   0.03s
  Chunk:  0.03s
  Embed:  2.54s
  Import: 0.37s
  Total:  2.97s

Processing Episode 108...
Loading transcript for episode 108: transcripts\Naruhodo _108 - Bebida alcoólica ajuda a falar mel_GPNIUjgqHPo_pt.txt
Chunked into 45 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.18s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.37s
  Import: 0.29s
  Total:  2.74s

Processing Episode 109...
Loading transcript for episode 109: transcripts\Naruhodo _109 - O cérebro humano é um computador__s3RCSFfV-OQ_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  2.75s
  Import: 0.42s
  Total:  3.27s

Processing Episode 110...
Loading transcript for episode 110: transcripts\Naruhodo _110 - Por que a volta parece mais rápida_S10zT7DPN7Y_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.41s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.82s
  Import: 0.45s
  Total:  3.33s

Processing Episode 111...
Loading transcript for episode 111: transcripts\Naruhodo _111 - Desafio Naruhodo_ Como você resolv_V708LnER8Sw_pt.txt
Chunked into 35 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.06it/s]


  Load:   0.04s
  Chunk:  0.02s
  Embed:  1.90s
  Import: 0.26s
  Total:  2.22s

Processing Episode 113...
Loading transcript for episode 113: transcripts\Naruhodo _113 - Por que as pessoas são destras ou _spZjtr9FOmk_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.64s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  3.29s
  Import: 0.50s
  Total:  3.87s

Processing Episode 114...
Loading transcript for episode 114: transcripts\Naruhodo _114 - Por que pássaros não têm medo de a_B-9hd04pvgw_pt.txt
Chunked into 45 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.06s
  Chunk:  0.01s
  Embed:  2.30s
  Import: 0.28s
  Total:  2.65s

Processing Episode 115...
Loading transcript for episode 115: transcripts\Naruhodo _115 - Por que precisamos falar sobre a f_OFRSybNDyq0_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.18s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.56s
  Import: 0.43s
  Total:  4.07s

Processing Episode 116...
Loading transcript for episode 116: transcripts\Naruhodo _116 - Razão e emoção estão em lados dife_AKKk4R5f91g_pt.txt
Chunked into 46 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.30s
  Import: 0.30s
  Total:  2.67s

Processing Episode 117...
Loading transcript for episode 117: transcripts\Naruhodo _117 - Por que grávidas têm desejos de co_luAQxkNR3Uw_pt.txt
Chunked into 44 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.13s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  2.25s
  Import: 0.36s
  Total:  2.72s

Processing Episode 118...
Loading transcript for episode 118: transcripts\Naruhodo _118 - Como se prevê a probabilidade de c_PUTwXjsbCd4_pt.txt
Chunked into 54 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.71s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.42s
  Import: 0.55s
  Total:  4.07s

Processing Episode 119...
Loading transcript for episode 119: transcripts\Naruhodo _119 - Estralar os dedos faz mal_ E outra_SrvPUApfAME_pt.txt
Chunked into 48 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.45s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.90s
  Import: 0.43s
  Total:  3.43s

Processing Episode 120...
Loading transcript for episode 120: transcripts\Naruhodo _120 - Comer iodo durante a gravidez aume_bqG-nTv-OHo_pt.txt
Chunked into 36 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.07s
  Chunk:  0.00s
  Embed:  2.31s
  Import: 0.32s
  Total:  2.70s

Processing Episode 121...
Loading transcript for episode 121: transcripts\Naruhodo _121 - É mito ou verdade que usamos apena_FJKEOvMILG4_pt.txt
Chunked into 40 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.09s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  2.18s
  Import: 0.29s
  Total:  2.57s

Processing Episode 122...
Loading transcript for episode 122: transcripts\Naruhodo _122 - Qual a importância que o estudo de_zrwNXhjGgCo_pt.txt
Chunked into 35 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.04it/s]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  1.93s
  Import: 0.24s
  Total:  2.25s

Processing Episode 123...
Loading transcript for episode 123: transcripts\Naruhodo _123 - O que é e como funciona o sonho lú_ThUlmkFFr1U_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.67s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  3.33s
  Import: 0.43s
  Total:  3.85s

Processing Episode 124...
Loading transcript for episode 124: transcripts\Naruhodo _124 - Reiki funciona segundo a ciência__Bh-YPbJ_dA0_pt.txt
Chunked into 25 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.65s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.65s
  Import: 0.22s
  Total:  1.94s

Processing Episode 125...
Loading transcript for episode 125: transcripts\Naruhodo _125 - Por que algumas pessoas passam mal_WS7GWadZ9lI_pt.txt
Chunked into 40 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.30s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.60s
  Import: 0.42s
  Total:  3.09s

Processing Episode 126...
Loading transcript for episode 126: transcripts\Naruhodo _126 - Desafio Naruhodo_ Qual a resposta _SU2Y94Cz4b4_pt.txt
Chunked into 30 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 1/1 [00:01<00:00,  1.78s/it]


  Load:   0.04s
  Chunk:  0.02s
  Embed:  1.78s
  Import: 0.25s
  Total:  2.08s

Processing Episode 127...
Loading transcript for episode 127: transcripts\Naruhodo _127 - Por que temos pintas em nosso corp_FSreBIJsbzs_pt.txt
Chunked into 47 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.17s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  2.35s
  Import: 0.33s
  Total:  2.75s

Processing Episode 128...
Loading transcript for episode 128: transcripts\Naruhodo _128 - É possível matar uma bactéria com _IwVhInWXrB0_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.19s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.58s
  Import: 0.40s
  Total:  4.05s

Processing Episode 132...
Loading transcript for episode 132: transcripts\Naruhodo _132 - De onde surgiu o conceito de nomes_oVZB1obaeYA_pt.txt
Chunked into 51 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


  Load:   0.03s
  Chunk:  0.03s
  Embed:  2.73s
  Import: 0.42s
  Total:  3.22s

Processing Episode 133...
Loading transcript for episode 133: transcripts\Naruhodo _133 - Os seres humanos podem ser multita_NDRw7aIkSnU_pt.txt
Chunked into 43 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.14s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.28s
  Import: 0.38s
  Total:  2.74s

Processing Episode 134...
Loading transcript for episode 134: transcripts\Naruhodo _134 - Bebida alcoólica aumenta a longevi_r0mvpl1hLgc_pt.txt
Chunked into 43 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.30s
  Import: 0.48s
  Total:  2.85s

Processing Episode 135...
Loading transcript for episode 135: transcripts\Naruhodo _135 - Como eu sei que você é você e não _Fq-VjuiTOY0_pt.txt
Chunked into 48 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.74s
  Import: 0.41s
  Total:  3.25s

Processing Episode 136...
Loading transcript for episode 136: transcripts\Naruhodo _136 - Como eu sei que você é você e não _yRZkLKL6QH0_pt.txt
Chunked into 57 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.60s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.21s
  Import: 0.48s
  Total:  3.78s

Processing Episode 137...
Loading transcript for episode 137: transcripts\Naruhodo _137 - O experimento da prisão de Stanfor_pyTbX9jmMWM_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


  Load:   0.03s
  Chunk:  0.02s
  Embed:  2.76s
  Import: 0.49s
  Total:  3.30s

Processing Episode 138...
Loading transcript for episode 138: transcripts\Naruhodo _138 - O que é bruxismo do sono__CFVyaXRNs0Q_pt.txt
Chunked into 33 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.17it/s]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  1.72s
  Import: 0.25s
  Total:  2.04s

Processing Episode 139...
Loading transcript for episode 139: transcripts\Naruhodo _139 - Por que crianças ricas vão melhor _w1uiXbZzsOM_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.65s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.31s
  Import: 0.44s
  Total:  3.82s

Processing Episode 140...
Loading transcript for episode 140: transcripts\Naruhodo _140 - Por que expressamos tanta raiva na_80dlKHZGUPs_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.53s/it]


  Load:   0.04s
  Chunk:  0.03s
  Embed:  3.07s
  Import: 0.57s
  Total:  3.71s

Processing Episode 141...
Loading transcript for episode 141: transcripts\Naruhodo _141 - Cheirar pum faz bem a saúde__ISe5ObqFjT0_pt.txt
Chunked into 37 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.12s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  2.25s
  Import: 0.32s
  Total:  2.66s

Processing Episode 142...
Loading transcript for episode 142: transcripts\Naruhodo _142 - Constelação familiar é uma ideia c_Yzh4Xl5U3Rs_pt.txt
Chunked into 38 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:01<00:00,  1.03it/s]


  Load:   0.04s
  Chunk:  0.02s
  Embed:  1.94s
  Import: 0.40s
  Total:  2.40s

Processing Episode 143...
Loading transcript for episode 143: transcripts\Naruhodo _143 - Constelação familiar é uma ideia c_zGY2wdSz-2U_pt.txt
Chunked into 50 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.36s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.74s
  Import: 0.39s
  Total:  3.20s

Processing Episode 144...
Loading transcript for episode 144: transcripts\Naruhodo _144 - Por que sentimos cócegas__8EPXNzzj-Ww_pt.txt
Chunked into 50 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.30s/it]


  Load:   0.04s
  Chunk:  0.01s
  Embed:  2.60s
  Import: 0.36s
  Total:  3.02s

Processing Episode 145...
Loading transcript for episode 145: transcripts\Naruhodo _145 - Por que a cabeça dói quando tomamo_qjq2Ds6YB-c_pt.txt
Chunked into 48 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.30s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.60s
  Import: 0.41s
  Total:  3.10s

Processing Episode 146...
Loading transcript for episode 146: transcripts\Naruhodo _146 - Por que precisamos falar sobre vac_kH1gDdrDcPg_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.59s
  Import: 0.82s
  Total:  5.50s

Processing Episode 147...
Loading transcript for episode 147: transcripts\Naruhodo _147 - Comer pimenta faz bem pra saúde__hSBQpaayjpw_pt.txt
Chunked into 47 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.36s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  2.72s
  Import: 0.34s
  Total:  3.15s

Processing Episode 148...
Loading transcript for episode 148: transcripts\Naruhodo _148 - A dieta low carb reduz a expectati_h53-rSxcRRw_pt.txt
Chunked into 56 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.57s/it]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  3.15s
  Import: 0.52s
  Total:  3.73s

Processing Episode 149...
Loading transcript for episode 149: transcripts\Naruhodo _149 - Por que damos risadas__2DG222emBS4_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.08s
  Import: 0.59s
  Total:  4.75s

Processing Episode 150...
Loading transcript for episode 150: transcripts\Naruhodo _150 - O que é o _No Fap September___8yWTngyTq1g_pt.txt
Chunked into 57 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.53s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.09s
  Import: 0.47s
  Total:  3.66s

Processing Episode 151...
Loading transcript for episode 151: transcripts\Naruhodo _151 - Especial Prêmio Ig Nobel 2018 - Pa_ZWwaVQ1pWzQ_pt.txt
Chunked into 36 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.02s/it]


  Load:   0.05s
  Chunk:  0.01s
  Embed:  2.03s
  Import: 0.38s
  Total:  2.48s

Processing Episode 152...
Loading transcript for episode 152: transcripts\Naruhodo _152 - Especial Prêmio Ig Nobel 2018 Part_gwk4s2Ut6XI_pt.txt
Chunked into 47 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.29s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.58s
  Import: 0.52s
  Total:  3.18s

Processing Episode 153...
Loading transcript for episode 153: transcripts\Naruhodo _153 - Sonambulismo tem cura__ghcxHlIK5RI_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.24s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.74s
  Import: 0.48s
  Total:  4.32s

Processing Episode 154...
Loading transcript for episode 154: transcripts\Naruhodo _154 - O que é a Lei de Benford__rmCxIP3YpmQ_pt.txt
Chunked into 56 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.59s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.18s
  Import: 0.47s
  Total:  3.75s

Processing Episode 155...
Loading transcript for episode 155: transcripts\Naruhodo _155 - Tomar decisões cansa o nosso céreb_tqEfVCT4dGo_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.56s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.14s
  Import: 0.49s
  Total:  3.70s

Processing Episode 156...
Loading transcript for episode 156: transcripts\Naruhodo _156 - O que é paralisia do sono__h9om8soj_uA_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.41s/it]


  Load:   0.04s
  Chunk:  0.03s
  Embed:  2.82s
  Import: 0.45s
  Total:  3.34s

Processing Episode 157...
Loading transcript for episode 157: transcripts\Naruhodo _157 - Atravessar portas faz a gente se e_5rTF6SlCo2Y_pt.txt
Chunked into 46 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.29s
  Import: 0.50s
  Total:  2.85s

Processing Episode 158...
Loading transcript for episode 158: transcripts\Naruhodo _158 - Desafio Naruhodo_ Quem será devora_H8pQL3qLGSs_pt.txt
Chunked into 37 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.23s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.47s
  Import: 0.52s
  Total:  3.08s

Processing Episode 159...
Loading transcript for episode 159: transcripts\Naruhodo _159 - A ciência consegue explicar tudo_ _7XTyCpohfy4_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.61s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.24s
  Import: 0.60s
  Total:  3.94s

Processing Episode 160...
Loading transcript for episode 160: transcripts\Naruhodo _160 - A ciência consegue explicar tudo_ _NBPr2vIAjQE_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.23s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  3.71s
  Import: 0.62s
  Total:  4.44s

Processing Episode 161...
Loading transcript for episode 161: transcripts\Naruhodo _161 - Visitar museus pode curar doenças__5B6YE_WT5dQ_pt.txt
Chunked into 56 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.54s/it]


  Load:   0.04s
  Chunk:  0.03s
  Embed:  3.08s
  Import: 0.61s
  Total:  3.77s

Processing Episode 162...
Loading transcript for episode 162: transcripts\Naruhodo _162 - Por que acontece o nocaute___UmiDEjZmfc_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.33s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.70s
  Import: 0.53s
  Total:  3.31s

Processing Episode 163...
Loading transcript for episode 163: transcripts\Naruhodo _163 - O que a anestesia desliga no nosso_6IoMagNybTI_pt.txt
Chunked into 45 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.47s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  2.95s
  Import: 0.57s
  Total:  3.62s

Processing Episode 164...
Loading transcript for episode 164: transcripts\Naruhodo _164 - Podemos ler emoções com base em ex_cq4oeBZ5kgo_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.22s
  Import: 0.60s
  Total:  4.93s

Processing Episode 165...
Loading transcript for episode 165: transcripts\Naruhodo _165 - Quando tomo antidepressivos contin_dWyfUyHUiA4_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.71s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  3.44s
  Import: 0.48s
  Total:  4.00s

Processing Episode 166...
Loading transcript for episode 166: transcripts\Naruhodo _166 - Por que as pessoas são desastradas_Soi6smLa-JE_pt.txt
Chunked into 46 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.15s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.31s
  Import: 0.46s
  Total:  2.83s

Processing Episode 167...
Loading transcript for episode 167: transcripts\Naruhodo _167 - Cocô de gato torna as pessoas mais_vOJKAzpT9Gw_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.44s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.90s
  Import: 0.48s
  Total:  3.45s

Processing Episode 168...
Loading transcript for episode 168: transcripts\Naruhodo _168 - Japonês é tudo igual__tu1s3JuB_Lw_pt.txt
Chunked into 49 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.26s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.52s
  Import: 0.43s
  Total:  3.02s

Processing Episode 169...
Loading transcript for episode 169: transcripts\Naruhodo _169 - Pessoas que publicam frases motiva_A0MXsp7KA4A_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.67s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  3.35s
  Import: 0.53s
  Total:  3.97s

Processing Episode 170...
Loading transcript for episode 170: transcripts\Naruhodo _170 - Para conseguir algo basta acredita_3ln4vHUiFGE_pt.txt
Chunked into 64 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.94s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.90s
  Import: 0.85s
  Total:  4.83s

Processing Episode 171...
Loading transcript for episode 171: transcripts\Naruhodo _171 - Por que o tamanho do papel sulfite_R0eCP3wmqWs_pt.txt
Chunked into 35 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.09s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.19s
  Import: 0.76s
  Total:  3.04s

Processing Episode 172...
Loading transcript for episode 172: transcripts\Naruhodo _172 - Por que as nuvens têm o formato de_MYmiVVGtQEc_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.95s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  3.91s
  Import: 0.89s
  Total:  4.90s

Processing Episode 173...
Loading transcript for episode 173: transcripts\Naruhodo _173 - O que são cacoetes__-Z3U2fqEYaI_pt.txt
Chunked into 54 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.52s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.04s
  Import: 0.74s
  Total:  3.88s

Processing Episode 174...
Loading transcript for episode 174: transcripts\Naruhodo _174 - Por que temos o ímpeto de pular de_Vek4N4QS04Q_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.74s/it]


  Load:   0.08s
  Chunk:  0.06s
  Embed:  3.50s
  Import: 0.64s
  Total:  4.28s

Processing Episode 175...
Loading transcript for episode 175: transcripts\Naruhodo _175 - Jogar videogame deixa as pessoas m_Lr2Ivgzg86k_pt.txt
Chunked into 52 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.47s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  2.95s
  Import: 0.58s
  Total:  3.63s

Processing Episode 176...
Loading transcript for episode 176: transcripts\Naruhodo _176 - Jogar videogame deixa as pessoas m_Iyd7mbTR9DM_pt.txt
Chunked into 55 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.57s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.15s
  Import: 0.76s
  Total:  3.99s

Processing Episode 177...
Loading transcript for episode 177: transcripts\Naruhodo _177 - Por que criamos amigos imaginários_1uERKcVNJvg_pt.txt
Chunked into 38 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  2.57s
  Import: 0.62s
  Total:  3.26s

Processing Episode 178...
Loading transcript for episode 178: transcripts\Naruhodo _178 - O que é ser normal__UY-AEqU59xY_pt.txt
Chunked into 69 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.75s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  5.26s
  Import: 1.06s
  Total:  6.43s

Processing Episode 179...
Loading transcript for episode 179: transcripts\Naruhodo _179 - Por que ouvimos algumas músicas mu_Jni5DjpF6MQ_pt.txt
Chunked into 48 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.43s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  2.87s
  Import: 0.56s
  Total:  3.52s

Processing Episode 180...
Loading transcript for episode 180: transcripts\Naruhodo _180 - Pessoas com letra feia são mais in_NcQxXNThlns_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.68s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  5.07s
  Import: 0.94s
  Total:  6.09s

Processing Episode 181...
Loading transcript for episode 181: transcripts\Naruhodo _181 - Por que soluçamos__NfDCO9_IEoA_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.54s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.61s
  Import: 0.88s
  Total:  5.59s

Processing Episode 183...
Loading transcript for episode 183: transcripts\Naruhodo _183 - É possível juntar exatas_ humanas _9oqajpETpt4_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  4.22s
  Import: 1.40s
  Total:  5.74s

Processing Episode 184...
Loading transcript for episode 184: transcripts\Naruhodo _184 - É possível juntar exatas_ humanas _VPt2fTNFnOs_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.91s/it]


  Load:   0.13s
  Chunk:  0.08s
  Embed:  5.75s
  Import: 1.17s
  Total:  7.13s

Processing Episode 185...
Loading transcript for episode 185: transcripts\Naruhodo _185 - O que é MMS e por que ele deve ser_yb9I1yY-TBM_pt.txt
Chunked into 70 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.39s
  Import: 0.86s
  Total:  5.37s

Processing Episode 186...
Loading transcript for episode 186: transcripts\Naruhodo _186 - O que são as 4 causas de Aristótel_GQnAQGbMpXc_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.83s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.67s
  Import: 0.85s
  Total:  4.63s

Processing Episode 187...
Loading transcript for episode 187: transcripts\Naruhodo _187 - Por que procrastinamos__gwALLmR3VYw_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.56s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.70s
  Import: 0.96s
  Total:  5.75s

Processing Episode 188...
Loading transcript for episode 188: transcripts\Naruhodo _188 - Contar carneirinhos faz a gente do_Txu8-QTZB7I_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.90s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  3.82s
  Import: 0.87s
  Total:  4.81s

Processing Episode 189...
Loading transcript for episode 189: transcripts\Naruhodo _189 - Por que reviramos os olhos__iJXFS72FDZI_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.06s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.13s
  Import: 0.92s
  Total:  5.15s

Processing Episode 190...
Loading transcript for episode 190: transcripts\Naruhodo _190 - Por que vozes e rostos às vezes pa_7EKTEUbD6fw_pt.txt
Chunked into 55 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.84s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  3.68s
  Import: 0.86s
  Total:  4.63s

Processing Episode 191...
Loading transcript for episode 191: transcripts\Naruhodo _191 - É possível aprender idiomas dormin_nz1FoXN8XqA_pt.txt
Chunked into 58 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.19s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  4.39s
  Import: 0.83s
  Total:  5.34s

Processing Episode 192...
Loading transcript for episode 192: transcripts\Naruhodo _192 - O que são Experiências de Quase-Mo_yLZcZt_AEiE_pt.txt
Chunked into 63 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.06s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.14s
  Import: 0.94s
  Total:  5.17s

Processing Episode 193...
Loading transcript for episode 193: transcripts\Naruhodo _193 - Como funciona o daltonismo__o1JY6P4hLoA_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.92s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.85s
  Import: 0.82s
  Total:  4.77s

Processing Episode 194...
Loading transcript for episode 194: transcripts\Naruhodo _194 - Uma pessoa pode ser enterrada viva_YIp_NWQ3xrk_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.47s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.43s
  Import: 1.20s
  Total:  5.75s

Processing Episode 195...
Loading transcript for episode 195: transcripts\Naruhodo _195 - Beber suco de frutas aumenta o ris_stfnGaOX1i8_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.11s
  Chunk:  0.23s
  Embed:  4.05s
  Import: 0.78s
  Total:  5.16s

Processing Episode 196...
Loading transcript for episode 196: transcripts\Naruhodo _196 - Por que colecionamos coisas__wtBSKMxua1k_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.79s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  5.37s
  Import: 1.20s
  Total:  6.66s

Processing Episode 197...
Loading transcript for episode 197: transcripts\Naruhodo _197 - Adoçantes artificiais podem fazer _e5-NEiKDfgA_pt.txt
Chunked into 50 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.53s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.05s
  Import: 0.60s
  Total:  3.75s

Processing Episode 198...
Loading transcript for episode 198: transcripts\Naruhodo _198 - Existe instinto materno_ - Parte 1_bIYkqfyuY7M_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.24s
  Import: 0.92s
  Total:  5.26s

Processing Episode 199...
Loading transcript for episode 199: transcripts\Naruhodo _199 - Existe instinto materno_ - Parte 2_PbyjY7DKf_g_pt.txt
Chunked into 57 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.62s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.24s
  Import: 0.70s
  Total:  4.03s

Processing Episode 200...
Loading transcript for episode 200: transcripts\Naruhodo _200 - Desafio Naruhodo_ Quantas pessoas _yMAtMHChdVg_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.71s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.46s
  Import: 0.71s
  Total:  4.24s

Processing Episode 201...
Loading transcript for episode 201: transcripts\Naruhodo _201 - Por que o nosso cérebro às vezes f_F-0b6YG9Zg0_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.27s/it]


  Load:   0.04s
  Chunk:  0.03s
  Embed:  3.80s
  Import: 0.80s
  Total:  4.68s

Processing Episode 202...
Loading transcript for episode 202: transcripts\Naruhodo _202 - Especial Prêmio Ig Nobel 2019 - Pa_LAqYysdSaLs_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.82s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  3.65s
  Import: 0.73s
  Total:  4.46s

Processing Episode 203...
Loading transcript for episode 203: transcripts\Naruhodo _203 - Especial Prêmio Ig Nobel 2019 - Pa_jv5klxlMMxM_pt.txt
Chunked into 53 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.14s/it]


  Load:   0.07s
  Chunk:  0.01s
  Embed:  4.29s
  Import: 0.74s
  Total:  5.11s

Processing Episode 204...
Loading transcript for episode 204: transcripts\Naruhodo _204 - Tudo que percebemos está no passad_GhUx4FCuTZk_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.39s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.16s
  Import: 0.96s
  Total:  5.20s

Processing Episode 205...
Loading transcript for episode 205: transcripts\Naruhodo _205 - Powerpoint é útil para a aprendiza_IFej_pgh9S8_pt.txt
Chunked into 84 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.54s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  4.64s
  Import: 1.37s
  Total:  6.16s

Processing Episode 206...
Loading transcript for episode 206: transcripts\Naruhodo _206 - Por que choramos__zWorZ-zK-c4_pt.txt
Chunked into 64 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.01s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.02s
  Import: 1.34s
  Total:  5.46s

Processing Episode 207...
Loading transcript for episode 207: transcripts\Naruhodo _207 - Vape e cigarro eletrônico são segu_Raa9CUrIFbs_pt.txt
Chunked into 56 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.89s/it]


  Load:   0.14s
  Chunk:  0.07s
  Embed:  3.80s
  Import: 0.67s
  Total:  4.69s

Processing Episode 208...
Loading transcript for episode 208: transcripts\Naruhodo _208 - Qual o efeito da publicidade sobre_E2s-p8D0MTc_pt.txt
Chunked into 67 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.25s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.76s
  Import: 0.79s
  Total:  4.65s

Processing Episode 209...
Loading transcript for episode 209: transcripts\Naruhodo _209 - Qual o efeito da publicidade sobre_gS3Sc21lEZU_pt.txt
Chunked into 52 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.52s/it]


  Load:   0.05s
  Chunk:  0.02s
  Embed:  3.04s
  Import: 0.84s
  Total:  3.95s

Processing Episode 210...
Loading transcript for episode 210: transcripts\Naruhodo _210 - Como funciona a gagueira__R97l00EysqI_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  4.21s
  Import: 0.87s
  Total:  5.17s

Processing Episode 211...
Loading transcript for episode 211: transcripts\Naruhodo _211 - O que são pessoas superdotadas__-B6_Np9YaSU_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.64s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.93s
  Import: 1.17s
  Total:  6.25s

Processing Episode 212...
Loading transcript for episode 212: transcripts\Naruhodo _212 - Cachorros podem se comunicar como _geb6PwSv1v8_pt.txt
Chunked into 67 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.12s
  Import: 0.85s
  Total:  5.06s

Processing Episode 213...
Loading transcript for episode 213: transcripts\Naruhodo _213 - Por que algumas pessoas conseguem _vVnnlT92tx4_pt.txt
Chunked into 58 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.81s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  3.64s
  Import: 0.81s
  Total:  4.53s

Processing Episode 214...
Loading transcript for episode 214: transcripts\Naruhodo _214 - Por que algumas pessoas conseguem _I4xtB6a3Ajw_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.97s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  3.95s
  Import: 1.00s
  Total:  5.03s

Processing Episode 215...
Loading transcript for episode 215: transcripts\Naruhodo _215 - Por que uma multidão cantando pare_CJypXtz3bZM_pt.txt
Chunked into 56 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.05s/it]


  Load:   0.08s
  Chunk:  0.06s
  Embed:  4.12s
  Import: 1.08s
  Total:  5.34s

Processing Episode 216...
Loading transcript for episode 216: transcripts\Naruhodo _216 - Por que sentimos ciúmes__oCSVc17yJ-g_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.55s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  4.65s
  Import: 1.10s
  Total:  5.89s

Processing Episode 217...
Loading transcript for episode 217: transcripts\Naruhodo _217 - Por que algumas pessoas tremem__K7KLyBBnK_Q_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.81s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  3.62s
  Import: 0.89s
  Total:  4.60s

Processing Episode 218...
Loading transcript for episode 218: transcripts\Naruhodo _218 - Existe a tal _sorte de principiant_WOWxol6g4kc_pt.txt
Chunked into 59 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.70s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.42s
  Import: 1.11s
  Total:  4.63s

Processing Episode 219...
Loading transcript for episode 219: transcripts\Naruhodo _219 - Flúor na água pode deixar as pesso_o1_OLYrLpas_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.54s/it]


  Load:   0.09s
  Chunk:  0.09s
  Embed:  4.63s
  Import: 1.03s
  Total:  5.85s

Processing Episode 220...
Loading transcript for episode 220: transcripts\Naruhodo _220 - Existe causa para a depressão_ - P_cFo8GFwyuR0_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.53s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  4.60s
  Import: 1.23s
  Total:  5.94s

Processing Episode 221...
Loading transcript for episode 221: transcripts\Naruhodo _221 - Existe causa para a depressão_ - P_5peXBmG43lU_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.50s/it]


  Load:   0.10s
  Chunk:  0.08s
  Embed:  4.51s
  Import: 1.12s
  Total:  5.81s

Processing Episode 222...
Loading transcript for episode 222: transcripts\Naruhodo _222 - Existe cognição quântica__J3jjmo7ly18_pt.txt
Chunked into 94 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:06<00:00,  2.00s/it]


  Load:   0.05s
  Chunk:  0.09s
  Embed:  6.01s
  Import: 1.65s
  Total:  7.80s

Processing Episode 223...
Loading transcript for episode 223: transcripts\Naruhodo _223 - Como funcionam os testes diagnósti_NyXY4gd8JgU_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.59s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.80s
  Import: 1.26s
  Total:  6.17s

Processing Episode 224...
Loading transcript for episode 224: transcripts\Naruhodo _224 - Por que espreguiçamos ao acordar__eL4cbQ3bnR4_pt.txt
Chunked into 51 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.57s/it]


  Load:   0.26s
  Chunk:  0.12s
  Embed:  3.17s
  Import: 1.00s
  Total:  4.56s

Processing Episode 225...
Loading transcript for episode 225: transcripts\Naruhodo _225 - A voz em nossa cabeça existe para _U0V8d99iodA_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:06<00:00,  2.01s/it]


  Load:   0.09s
  Chunk:  0.08s
  Embed:  6.04s
  Import: 1.21s
  Total:  7.42s

Processing Episode 226...
Loading transcript for episode 226: transcripts\Naruhodo _226 - Como lidar com epidemias__qZSiU9JLDlo_pt.txt
Chunked into 110 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 4/4 [00:06<00:00,  1.57s/it]


  Load:   0.10s
  Chunk:  0.15s
  Embed:  6.31s
  Import: 1.55s
  Total:  8.11s

Processing Episode 228...
Loading transcript for episode 228: transcripts\Naruhodo _228 - Todos nós enxergamos as mesmas cor_JCz4kD83KJc_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.23s
  Import: 0.98s
  Total:  5.33s

Processing Episode 229...
Loading transcript for episode 229: transcripts\Naruhodo _229 - O medo aumenta a produtividade no _HladRKLnJ_U_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.70s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  5.13s
  Import: 1.36s
  Total:  6.62s

Processing Episode 230...
Loading transcript for episode 230: transcripts\Naruhodo _230 - Por que quando olhamos para uma pe__86RQ8y_peY_pt.txt
Chunked into 67 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.33s/it]


  Load:   0.11s
  Chunk:  0.05s
  Embed:  4.00s
  Import: 1.02s
  Total:  5.17s

Processing Episode 231...
Loading transcript for episode 231: transcripts\Naruhodo _231 - Gêmeos têm a mesma impressão digit_AH5LQPW4lbI_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.39s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.18s
  Import: 1.00s
  Total:  5.28s

Processing Episode 232...
Loading transcript for episode 232: transcripts\Naruhodo _232 - Ser bilingue pode ser um problema__HVCl9ZV-8iY_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.57s/it]


  Load:   0.25s
  Chunk:  0.19s
  Embed:  4.73s
  Import: 1.61s
  Total:  6.77s

Processing Episode 233...
Loading transcript for episode 233: transcripts\Naruhodo _233 - O que é o _efeito cumbuca___fW6uoBmt83c_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.48s/it]


  Load:   0.09s
  Chunk:  0.03s
  Embed:  4.48s
  Import: 1.12s
  Total:  5.72s

Processing Episode 234...
Loading transcript for episode 234: transcripts\Naruhodo _234 - Assistir à TV de perto causa miopi_MB1Mg5sHxu8_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.88s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  5.65s
  Import: 1.52s
  Total:  7.27s

Processing Episode 236...
Loading transcript for episode 236: transcripts\Naruhodo _236 - Por que temos dor de cabeça__q8FtXVlSz1I_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.64s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.94s
  Import: 1.51s
  Total:  6.55s

Processing Episode 237...
Loading transcript for episode 237: transcripts\Naruhodo _237 - Por que não gostamos de esperar__5398FGSSBy8_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:04<00:00,  2.03s/it]


  Load:   0.08s
  Chunk:  0.05s
  Embed:  4.07s
  Import: 0.93s
  Total:  5.12s

Processing Episode 238...
Loading transcript for episode 238: transcripts\Naruhodo _238 - O distancionamento social impacta _SHKiDA21Uvc_pt.txt
Chunked into 61 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.92s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  3.84s
  Import: 0.90s
  Total:  4.83s

Processing Episode 239...
Loading transcript for episode 239: transcripts\Naruhodo _239 - O distancionamento social impacta _1Ya1lx7sueQ_pt.txt
Chunked into 70 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.26s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.79s
  Import: 0.73s
  Total:  4.62s

Processing Episode 240...
Loading transcript for episode 240: transcripts\Naruhodo _240 - Por que ruídos nos incomodam tanto_znMotnnX4LI_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.26s/it]


  Load:   0.07s
  Chunk:  0.02s
  Embed:  3.79s
  Import: 1.12s
  Total:  4.99s

Processing Episode 241...
Loading transcript for episode 241: transcripts\Naruhodo _241 - Por que as pessoas querem sempre t_q-A9IwOBIA4_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.11s
  Chunk:  0.09s
  Embed:  4.07s
  Import: 0.95s
  Total:  5.22s

Processing Episode 242...
Loading transcript for episode 242: transcripts\Naruhodo _242 - O experimento do Parque dos Ratos _rBI0twj0wD4_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.29s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.87s
  Import: 0.95s
  Total:  4.93s

Processing Episode 243...
Loading transcript for episode 243: transcripts\Naruhodo _243 - Mover os olhos pode reprogramar su_fsFqXlEzIQg_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.27s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  3.82s
  Import: 0.94s
  Total:  4.87s

Processing Episode 244...
Loading transcript for episode 244: transcripts\Naruhodo _244 - Por que mentimos__Yk1xiLU8P98_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.11s
  Import: 1.87s
  Total:  6.10s

Processing Episode 245...
Loading transcript for episode 245: transcripts\Naruhodo _245 - Por que sempre tem espaço pro doce_mMRAGpdXEp8_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.95s
  Import: 0.88s
  Total:  4.95s

Processing Episode 246...
Loading transcript for episode 246: transcripts\Naruhodo _246 - O que os outros esperam de nós nos_q_AK3hUlJVw_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.07s
  Import: 0.94s
  Total:  5.10s

Processing Episode 247...
Loading transcript for episode 247: transcripts\Naruhodo _247 - O que é telemedicina e por que ela_GiohD1RwNRU_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.38s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  4.18s
  Import: 1.00s
  Total:  5.27s

Processing Episode 248...
Loading transcript for episode 248: transcripts\Naruhodo _248 - Meninos são de exatas e meninas sã_a1ORkfYYwm0_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.39s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  4.19s
  Import: 1.01s
  Total:  5.28s

Processing Episode 250...
Loading transcript for episode 250: transcripts\Naruhodo _250 - Desafio Naruhodo_ Será que Selma a_FyMBQhYFyP0_pt.txt
Chunked into 33 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.04s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  2.09s
  Import: 0.61s
  Total:  2.78s

Processing Episode 251...
Loading transcript for episode 251: transcripts\Naruhodo _251 - O que é a síndrome do impostor__tV2jAdJNg4Y_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.09s
  Import: 0.96s
  Total:  5.15s

Processing Episode 252...
Loading transcript for episode 252: transcripts\Naruhodo _252 - A pirâmide de Maslow faz sentido__ZtmxN--tVRY_pt.txt
Chunked into 84 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.43s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.30s
  Import: 1.01s
  Total:  5.41s

Processing Episode 253...
Loading transcript for episode 253: transcripts\Naruhodo _253 - Por que sentimos nosso corpo formi_tt3D2vtLfWI_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.06s
  Embed:  3.86s
  Import: 1.53s
  Total:  5.50s

Processing Episode 254...
Loading transcript for episode 254: transcripts\Naruhodo _254 - Especial Prêmio IgNobel 2020 - Par_UPepIoQmdwU_pt.txt
Chunked into 69 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  3.85s
  Import: 0.80s
  Total:  4.75s

Processing Episode 255...
Loading transcript for episode 255: transcripts\Naruhodo _255 - Especial Prêmio IgNobel 2020 - Par_xLdFisNuu8o_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.10s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.29s
  Import: 0.71s
  Total:  4.08s

Processing Episode 256...
Loading transcript for episode 256: transcripts\Naruhodo _256 - Por que roncamos__SfJH_F2GsI4_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  3.86s
  Import: 0.92s
  Total:  4.86s

Processing Episode 257...
Loading transcript for episode 257: transcripts\Naruhodo _257 - Sons binaurais ajudam a nossa ment_Ut7zpXUd85A_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.23s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  3.70s
  Import: 0.77s
  Total:  4.57s

Processing Episode 258...
Loading transcript for episode 258: transcripts\Naruhodo _258 - Os carros precisam ser mais veloze_ZhZfTD2T0Kw_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.43s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.28s
  Import: 0.85s
  Total:  5.21s

Processing Episode 259...
Loading transcript for episode 259: transcripts\Naruhodo _259 - Por que as coisas parecem óbvias d_fsgAdq_iu-A_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.08s
  Chunk:  0.02s
  Embed:  3.95s
  Import: 0.89s
  Total:  4.94s

Processing Episode 260...
Loading transcript for episode 260: transcripts\Naruhodo _260 - Por que as coisas parecem óbvias d_jWTaLWjT-ZU_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  3.86s
  Import: 1.00s
  Total:  4.96s

Processing Episode 261...
Loading transcript for episode 261: transcripts\Naruhodo _261 - O que a solidão pode causar nas pe_02dPRPGcqVs_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.03s
  Import: 1.14s
  Total:  5.27s

Processing Episode 262...
Loading transcript for episode 262: transcripts\Naruhodo _262 - Por que damos mais atenção às notí_umjcdvz3jeo_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.55s/it]


  Load:   0.07s
  Chunk:  0.11s
  Embed:  4.66s
  Import: 1.00s
  Total:  5.84s

Processing Episode 263...
Loading transcript for episode 263: transcripts\Naruhodo _263 - O que é transumanismo__Ni9JH0IzxBY_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.32s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.98s
  Import: 1.10s
  Total:  5.18s

Processing Episode 264...
Loading transcript for episode 264: transcripts\Naruhodo _264 - Por que é importante conhecer noss_5D3ezsGM_5s_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.08s
  Import: 0.99s
  Total:  5.19s

Processing Episode 265...
Loading transcript for episode 265: transcripts\Naruhodo _265 - Como funcionam os testes de vacina_XfQEeekD3iY_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.58s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.75s
  Import: 1.19s
  Total:  6.06s

Processing Episode 267...
Loading transcript for episode 267: transcripts\Naruhodo _267 - O que é dissonância cognitiva__1xJwqmir5Uw_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]


  Load:   0.04s
  Chunk:  0.05s
  Embed:  4.38s
  Import: 0.96s
  Total:  5.43s

Processing Episode 268...
Loading transcript for episode 268: transcripts\Naruhodo _268 - O que é dissonância cognitiva_ - P_--OHlHmOQTM_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.42s/it]


  Load:   0.04s
  Chunk:  0.04s
  Embed:  4.27s
  Import: 1.01s
  Total:  5.38s

Processing Episode 269...
Loading transcript for episode 269: transcripts\Naruhodo _269 - Por que existe a escuridão da noit_X-EiqxacnTo_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.33s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.01s
  Import: 0.99s
  Total:  5.11s

Processing Episode 272...
Loading transcript for episode 272: transcripts\Naruhodo _272 - Quais são os grandes desafios da p_Kxt23k6HCa0_pt.txt
Chunked into 95 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.77s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  5.31s
  Import: 1.33s
  Total:  6.78s

Processing Episode 273...
Loading transcript for episode 273: transcripts\Naruhodo _273 - Por que temos dificuldades em comp_1nDaJ8_3Rs8_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.45s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  4.36s
  Import: 1.06s
  Total:  5.54s

Processing Episode 274...
Loading transcript for episode 274: transcripts\Naruhodo _274 - Tomar café reduz a massa cinzenta _7q5XePv09xY_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  4.57s
  Import: 1.11s
  Total:  5.76s

Processing Episode 275...
Loading transcript for episode 275: transcripts\Naruhodo _275 - Por que sorrimos__DhyeVD1gtjI_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.69s/it]


  Load:   0.08s
  Chunk:  0.04s
  Embed:  5.06s
  Import: 1.19s
  Total:  6.38s

Processing Episode 276...
Loading transcript for episode 276: transcripts\Naruhodo _276 - Por que xingamos__yIeIlct2DPI_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.54s/it]


  Load:   0.05s
  Chunk:  0.08s
  Embed:  4.64s
  Import: 1.23s
  Total:  6.01s

Processing Episode 278...
Loading transcript for episode 278: transcripts\Naruhodo _278 - O que é singularidade_ - Parte 2 d_euBpSfbX3lk_pt.txt
Chunked into 65 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.19s/it]


  Load:   0.10s
  Chunk:  0.10s
  Embed:  3.58s
  Import: 0.97s
  Total:  4.76s

Processing Episode 280...
Loading transcript for episode 280: transcripts\Naruhodo _280 - Por que as pessoas compartilham fa_-4N8SSKiUsU_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.48s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.43s
  Import: 1.04s
  Total:  5.58s

Processing Episode 281...
Loading transcript for episode 281: transcripts\Naruhodo _281 - Aprendemos mais quando somos punid_hy-SRTS9pW0_pt.txt
Chunked into 91 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.82s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  5.48s
  Import: 1.12s
  Total:  6.69s

Processing Episode 282...
Loading transcript for episode 282: transcripts\Naruhodo _282 - Anotar à mão é melhor que com comp_23WNys_AIIw_pt.txt
Chunked into 65 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.22s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.67s
  Import: 0.88s
  Total:  4.65s

Processing Episode 283...
Loading transcript for episode 283: transcripts\Naruhodo _283 - Como se formam os sotaques e as gí_JmXjRynAZqo_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.13s
  Import: 0.95s
  Total:  5.18s

Processing Episode 284...
Loading transcript for episode 284: transcripts\Naruhodo _284 - Qual o impacto do desemprego em no_L3UsqrjLmRA_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.20s
  Import: 1.50s
  Total:  5.78s

Processing Episode 285...
Loading transcript for episode 285: transcripts\Naruhodo _285 - Por que outras pessoas não entende_sG51qICc-ew_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.63s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.90s
  Import: 1.23s
  Total:  6.22s

Processing Episode 286...
Loading transcript for episode 286: transcripts\Naruhodo _286 - Por que sentimos vergonha_ - Parte_eDneD9_4rrE_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.57s/it]


  Load:   0.09s
  Chunk:  0.04s
  Embed:  4.71s
  Import: 1.16s
  Total:  6.00s

Processing Episode 287...
Loading transcript for episode 287: transcripts\Naruhodo _287 - Por que sentimos vergonha_ - Parte_L0K9LE8skyE_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.70s/it]


  Load:   0.06s
  Chunk:  0.09s
  Embed:  5.12s
  Import: 1.24s
  Total:  6.52s

Processing Episode 288...
Loading transcript for episode 288: transcripts\Naruhodo _288 - Por que existe a menopausa__3Ewwdi2guWg_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:06<00:00,  2.03s/it]


  Load:   0.16s
  Chunk:  0.13s
  Embed:  6.12s
  Import: 1.14s
  Total:  7.56s

Processing Episode 289...
Loading transcript for episode 289: transcripts\Naruhodo _289 - Ficamos parecidos com nossos pais _3U1fq9CSak0_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.57s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.72s
  Import: 1.12s
  Total:  5.92s

Processing Episode 290...
Loading transcript for episode 290: transcripts\Naruhodo _290 - O que é e para que serve o teste d_r3BMM6vO9r4_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.63s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.89s
  Import: 1.63s
  Total:  6.63s

Processing Episode 291...
Loading transcript for episode 291: transcripts\Naruhodo _291 - Por que preferimos certas marcas a_ho78PbF8LJ0_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.48s/it]


  Load:   0.08s
  Chunk:  0.09s
  Embed:  4.44s
  Import: 1.05s
  Total:  5.66s

Processing Episode 292...
Loading transcript for episode 292: transcripts\Naruhodo _292 - Por que mexemos as mãos quando fal_z4LJfliC46E_pt.txt
Chunked into 90 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.56s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.68s
  Import: 1.27s
  Total:  6.05s

Processing Episode 293...
Loading transcript for episode 293: transcripts\Naruhodo _293 - Por que spoilers nos afetam tanto__3ubW-iFTyOw_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.30s
  Chunk:  0.05s
  Embed:  4.05s
  Import: 1.03s
  Total:  5.43s

Processing Episode 294...
Loading transcript for episode 294: transcripts\Naruhodo _294 - Por que apartamos brigas entre ani_BLuUGapu5oY_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.75s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  5.26s
  Import: 1.33s
  Total:  6.68s

Processing Episode 295...
Loading transcript for episode 295: transcripts\Naruhodo _295 - Tomar banho gelado faz bem pra saú_I9rvP5waSWE_pt.txt
Chunked into 88 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.59s/it]


  Load:   0.16s
  Chunk:  0.05s
  Embed:  4.77s
  Import: 1.27s
  Total:  6.26s

Processing Episode 296...
Loading transcript for episode 296: transcripts\Naruhodo _296 - Todas as pessoas imaginam as coisa_ZcNZ92bTZc4_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.08s
  Chunk:  0.08s
  Embed:  4.12s
  Import: 1.17s
  Total:  5.44s

Processing Episode 297...
Loading transcript for episode 297: transcripts\Naruhodo _297 - Balançar de um lado para o outro a_LvuqqtayK60_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.70s/it]


  Load:   0.05s
  Chunk:  0.08s
  Embed:  5.13s
  Import: 1.68s
  Total:  6.93s

Processing Episode 298...
Loading transcript for episode 298: transcripts\Naruhodo _298 - Ouvir música clássica nos torna ma_xrhuAGOt_Ds_pt.txt
Chunked into 70 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.11s
  Import: 1.10s
  Total:  5.32s

Processing Episode 299...
Loading transcript for episode 299: transcripts\Naruhodo _299 - Como buscar fontes confiáveis sobr__uQCEsN8YOw_pt.txt
Chunked into 90 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.69s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  5.08s
  Import: 1.71s
  Total:  6.91s

Processing Episode 300...
Loading transcript for episode 300: transcripts\Naruhodo _300 - Desafio Naruhodo_ Alaor conseguirá_uQrlSZtbCZI_pt.txt
Chunked into 67 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.18s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.54s
  Import: 1.39s
  Total:  5.05s

Processing Episode 301...
Loading transcript for episode 301: transcripts\Naruhodo _301 - Somos tão bons em algo quanto acha_mpxo5ik1H9E_pt.txt
Chunked into 91 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.89s/it]


  Load:   0.09s
  Chunk:  0.06s
  Embed:  5.68s
  Import: 1.45s
  Total:  7.27s

Processing Episode 302...
Loading transcript for episode 302: transcripts\Naruhodo _302 - Prêmio IgNobel 2021 - Parte 1 de 2_tos9wQyGSTI_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.39s/it]


  Load:   0.08s
  Chunk:  0.05s
  Embed:  4.18s
  Import: 1.15s
  Total:  5.47s

Processing Episode 303...
Loading transcript for episode 303: transcripts\Naruhodo _303 - Prêmio IgNobel 2021 - Parte 2 de 2_D3QDkBx7_os_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.45s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.38s
  Import: 1.28s
  Total:  5.76s

Processing Episode 304...
Loading transcript for episode 304: transcripts\Naruhodo _304 - Como saber se uma pesquisa científ_Q-qrIWD_x2U_pt.txt
Chunked into 92 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.74s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  5.23s
  Import: 1.49s
  Total:  6.83s

Processing Episode 305...
Loading transcript for episode 305: transcripts\Naruhodo _305 - Por que seguimos líderes___HF-NDVn4ks_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.06s
  Import: 1.08s
  Total:  5.24s

Processing Episode 306...
Loading transcript for episode 306: transcripts\Naruhodo _306 - Sentir gratidão faz bem pra saúde__Trx6S76yCZk_pt.txt
Chunked into 97 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 4/4 [00:05<00:00,  1.31s/it]


  Load:   0.05s
  Chunk:  0.07s
  Embed:  5.25s
  Import: 1.32s
  Total:  6.69s

Processing Episode 307...
Loading transcript for episode 307: transcripts\Naruhodo _307 - Música melhora o desempenho em ati_Z336lQcrMGM_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  3.95s
  Import: 1.14s
  Total:  5.17s

Processing Episode 308...
Loading transcript for episode 308: transcripts\Naruhodo _308 - Alimentos reimosos fazem mal à saú_-6sRH-GZweU_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.65s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  4.96s
  Import: 1.62s
  Total:  6.71s

Processing Episode 309...
Loading transcript for episode 309: transcripts\Naruhodo _309 - Por que sentimos medo_ - Parte 1 d_xNwl26ZbVD8_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.93s
  Import: 1.22s
  Total:  5.26s

Processing Episode 310...
Loading transcript for episode 310: transcripts\Naruhodo _310 - Por que sentimos medo_ - Parte 2 d_cqkh5IdfQQM_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.49s/it]


  Load:   0.09s
  Chunk:  0.10s
  Embed:  4.46s
  Import: 1.82s
  Total:  6.46s

Processing Episode 311...
Loading transcript for episode 311: transcripts\Naruhodo _311 - O apego mãe-bebê é algo inato__vkcIZBPNOQg_pt.txt
Chunked into 88 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.62s/it]


  Load:   0.11s
  Chunk:  0.08s
  Embed:  4.87s
  Import: 1.38s
  Total:  6.43s

Processing Episode 312...
Loading transcript for episode 312: transcripts\Naruhodo _312 - Ficar sentado muito tempo aumenta _6ZFLoDFLFTY_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.50s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  4.50s
  Import: 1.14s
  Total:  5.72s

Processing Episode 313...
Loading transcript for episode 313: transcripts\Naruhodo _313 - Filtros de fotos em redes sociais _nf3nEIpLG6w_pt.txt
Chunked into 94 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.63s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.91s
  Import: 1.48s
  Total:  6.49s

Processing Episode 314...
Loading transcript for episode 314: transcripts\Naruhodo _314 - COVID 2021_ de onde viemos e para _oLDbaKxnMHI_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.60s
  Import: 1.63s
  Total:  6.31s

Processing Episode 315...
Loading transcript for episode 315: transcripts\Naruhodo _315 - COVID 2021_ de onde viemos e para _VkqwcNWzN7k_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.06s
  Import: 1.26s
  Total:  5.44s

Processing Episode 316...
Loading transcript for episode 316: transcripts\Naruhodo _316 - Como funciona a publicação de arti_i1FsPK-bb8k_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.50s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.50s
  Import: 1.40s
  Total:  6.02s

Processing Episode 317...
Loading transcript for episode 317: transcripts\Naruhodo _317 - Por que algumas pessoas têm _medo__qE_LIxYBX_4_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.42s/it]


  Load:   0.08s
  Chunk:  0.05s
  Embed:  4.28s
  Import: 1.24s
  Total:  5.65s

Processing Episode 318...
Loading transcript for episode 318: transcripts\Naruhodo _318 - Por que algumas pessoas acordam de_PKvR_lr5ZIw_pt.txt
Chunked into 88 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.72s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  5.18s
  Import: 1.34s
  Total:  6.64s

Processing Episode 319...
Loading transcript for episode 319: transcripts\Naruhodo _319 - O tempo passa mais rápido quando f_8xgBvsN0b_I_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.09s
  Chunk:  0.06s
  Embed:  4.58s
  Import: 1.69s
  Total:  6.42s

Processing Episode 320...
Loading transcript for episode 320: transcripts\Naruhodo _320 - Por que nos identificamos com vilõ_ZH5aTG0xeLw_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.58s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  4.75s
  Import: 1.91s
  Total:  6.80s

Processing Episode 321...
Loading transcript for episode 321: transcripts\Naruhodo _321 - Debates virtuais são perda de temp_lL0uTqpBwjE_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.08s
  Chunk:  0.08s
  Embed:  4.22s
  Import: 1.33s
  Total:  5.70s

Processing Episode 322...
Loading transcript for episode 322: transcripts\Naruhodo _322 - É possível apagar um poste de luz _0ThLd3C4vvE_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.24s
  Import: 1.60s
  Total:  5.96s

Processing Episode 323...
Loading transcript for episode 323: transcripts\Naruhodo _323 - Ho_oponopono e os Números de Grabo_6FqUqyVSH04_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.28s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  3.83s
  Import: 1.23s
  Total:  5.16s

Processing Episode 324...
Loading transcript for episode 324: transcripts\Naruhodo _324 - Por que sentimos nostalgia__UHajyH8RFSA_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.34s
  Import: 1.28s
  Total:  5.75s

Processing Episode 325...
Loading transcript for episode 325: transcripts\Naruhodo _325 - Por que nos apaixonamos por vilões_o9F4Q_jjF88_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.25s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.77s
  Import: 1.13s
  Total:  4.99s

Processing Episode 326...
Loading transcript for episode 326: transcripts\Naruhodo _326 - Por que nos apaixonamos por vilões_4gtkstkqpUw_pt.txt
Chunked into 69 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.03s
  Import: 1.17s
  Total:  5.29s

Processing Episode 327...
Loading transcript for episode 327: transcripts\Naruhodo _327 - Por que não esquecemos como andar __lj431qvU84_pt.txt
Chunked into 94 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.78s/it]


  Load:   0.07s
  Chunk:  0.08s
  Embed:  5.35s
  Import: 1.66s
  Total:  7.15s

Processing Episode 328...
Loading transcript for episode 328: transcripts\Naruhodo _328 - Existem _gatilhos mentais___fxBQJlin8Z4_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.07s
  Import: 1.76s
  Total:  5.92s

Processing Episode 329...
Loading transcript for episode 329: transcripts\Naruhodo _329 - Por que mensagem em CAIXA ALTA par_l3kj47llbfo_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.42s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.26s
  Import: 1.60s
  Total:  5.99s

Processing Episode 330...
Loading transcript for episode 330: transcripts\Naruhodo _330 - O teste palográfico funciona__zkkLLKB4LX8_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.55s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.65s
  Import: 1.85s
  Total:  6.63s

Processing Episode 331...
Loading transcript for episode 331: transcripts\Naruhodo _331 - Pessoas sincronizam suas mentes qu_-JjxblfNjRk_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.49s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.48s
  Import: 1.35s
  Total:  5.96s

Processing Episode 332...
Loading transcript for episode 332: transcripts\Naruhodo _332 - Todos os cachorros falam a mesma l_J8zTiqhZGWE_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.08s
  Import: 1.16s
  Total:  5.33s

Processing Episode 333...
Loading transcript for episode 333: transcripts\Naruhodo _333 - Quais as consequências do excesso _XMQdUfCWKEk_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.10s
  Import: 1.40s
  Total:  5.61s

Processing Episode 334...
Loading transcript for episode 334: transcripts\Naruhodo _334 - Por que caímos em golpes__Wc3FoXXzPSQ_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.63s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.88s
  Import: 1.78s
  Total:  6.79s

Processing Episode 335...
Loading transcript for episode 335: transcripts\Naruhodo _335 - Por que algumas pessoas têm prazer_2wCDVlZ83x8_pt.txt
Chunked into 84 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  4.56s
  Import: 1.63s
  Total:  6.32s

Processing Episode 336...
Loading transcript for episode 336: transcripts\Naruhodo _336 - Podemos confiar nas pesquisas elei_6pdBfd6Robo_pt.txt
Chunked into 96 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.73s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  5.21s
  Import: 1.61s
  Total:  6.95s

Processing Episode 337...
Loading transcript for episode 337: transcripts\Naruhodo _337 - Podemos confiar nas pesquisas elei_VxtAD9gINSQ_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.66s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.98s
  Import: 1.56s
  Total:  6.64s

Processing Episode 338...
Loading transcript for episode 338: transcripts\Naruhodo _338 - Por que fofocamos__ij9ocesTc50_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.55s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  4.67s
  Import: 1.52s
  Total:  6.31s

Processing Episode 339...
Loading transcript for episode 339: transcripts\Naruhodo _339 - Por que as coisas parecem girar qu_YmK1Yq0mwW8_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.50s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  4.51s
  Import: 1.74s
  Total:  6.37s

Processing Episode 340...
Loading transcript for episode 340: transcripts\Naruhodo _340 - Como se constrói a auto-estima__0ULx-CXmh7w_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.25s
  Import: 1.58s
  Total:  5.93s

Processing Episode 341...
Loading transcript for episode 341: transcripts\Naruhodo _341 - Cooperação entre seres vivos é alg_N4mL78Sm-_8_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.08s
  Chunk:  0.06s
  Embed:  4.34s
  Import: 1.25s
  Total:  5.73s

Processing Episode 342...
Loading transcript for episode 342: transcripts\Naruhodo _342 - O que é e de onde vem a inspiração_Xg0vGC-uPwM_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.21s
  Chunk:  0.06s
  Embed:  3.93s
  Import: 1.28s
  Total:  5.48s

Processing Episode 343...
Loading transcript for episode 343: transcripts\Naruhodo _343 - O que é e como funciona uma relaçã_rrF27pTFGg8_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.61s/it]


  Load:   0.26s
  Chunk:  0.06s
  Embed:  4.82s
  Import: 2.18s
  Total:  7.31s

Processing Episode 345...
Loading transcript for episode 345: transcripts\Naruhodo _345 - Por que às vezes sentimos as dores_mKdMBCqy6XA_pt.txt
Chunked into 84 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.51s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.54s
  Import: 1.76s
  Total:  6.45s

Processing Episode 346...
Loading transcript for episode 346: transcripts\Naruhodo _346 - Programação Neurolinguística _PNL__p9-iauANzY0_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.05s
  Chunk:  0.03s
  Embed:  4.21s
  Import: 1.24s
  Total:  5.54s

Processing Episode 347...
Loading transcript for episode 347: transcripts\Naruhodo _347 - Programação Neurolinguística _PNL__yggQXOE9lRY_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.32s/it]


  Load:   0.07s
  Chunk:  0.18s
  Embed:  3.97s
  Import: 1.48s
  Total:  5.71s

Processing Episode 348...
Loading transcript for episode 348: transcripts\Naruhodo _348 - Sentir medo e ansiedade é algo rui_u30dN7ACvE4_pt.txt
Chunked into 91 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.82s/it]


  Load:   0.09s
  Chunk:  0.07s
  Embed:  5.46s
  Import: 1.53s
  Total:  7.15s

Processing Episode 349...
Loading transcript for episode 349: transcripts\Naruhodo _349 - O que são relações parassociais__k7qj3moiheg_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.19s
  Import: 1.27s
  Total:  5.57s

Processing Episode 350...
Loading transcript for episode 350: transcripts\Naruhodo _350 - Desafio_ Será que Anna conseguirá _nxG7_tqSGJE_pt.txt
Chunked into 55 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.32s/it]


  Load:   0.32s
  Chunk:  0.06s
  Embed:  2.65s
  Import: 0.93s
  Total:  3.96s

Processing Episode 351...
Loading transcript for episode 351: transcripts\Naruhodo _351 - A consciência é um campo energétic_7hb-kGwJbWo_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  4.39s
  Import: 1.72s
  Total:  6.21s

Processing Episode 352...
Loading transcript for episode 352: transcripts\Naruhodo _352 - Por que pedimos desculpas_ - Parte_jVzZ9dTAgGY_pt.txt
Chunked into 70 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.22s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.67s
  Import: 1.24s
  Total:  5.03s

Processing Episode 353...
Loading transcript for episode 353: transcripts\Naruhodo _353 - Por que pedimos desculpas_ - Parte_vvOMB66B5u0_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.04s
  Import: 1.43s
  Total:  5.58s

Processing Episode 354...
Loading transcript for episode 354: transcripts\Naruhodo _354 - Consumir vídeos e podcasts em velo_aX3IJO234tI_pt.txt
Chunked into 88 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.47s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.43s
  Import: 1.72s
  Total:  6.26s

Processing Episode 355...
Loading transcript for episode 355: transcripts\Naruhodo _355 - Prêmio IgNobel 2022 - Parte 1 de 2_KIx5uHKgHLs_pt.txt
Chunked into 60 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.65s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  3.33s
  Import: 1.08s
  Total:  4.50s

Processing Episode 356...
Loading transcript for episode 356: transcripts\Naruhodo _356 - Prêmio IgNobel 2022 - Parte 2 de 2_WIOVn1hDt8s_pt.txt
Chunked into 73 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.29s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  3.88s
  Import: 1.27s
  Total:  5.24s

Processing Episode 357...
Loading transcript for episode 357: transcripts\Naruhodo _357 - Existe possibilidade de consenso n_KhyKRnhjnbw_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.60s/it]


  Load:   0.09s
  Chunk:  0.07s
  Embed:  4.81s
  Import: 1.85s
  Total:  6.83s

Processing Episode 358...
Loading transcript for episode 358: transcripts\Naruhodo _358 - Para que serve o dinheiro__rkzRK1frpWM_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.57s/it]


  Load:   0.08s
  Chunk:  0.08s
  Embed:  4.71s
  Import: 1.83s
  Total:  6.70s

Processing Episode 359...
Loading transcript for episode 359: transcripts\Naruhodo _359 - Recompensas pagas ou loot boxes em_JfZ9gKuGIXI_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.33s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.00s
  Import: 1.72s
  Total:  5.83s

Processing Episode 360...
Loading transcript for episode 360: transcripts\Naruhodo _360 - O que é e como lidar com o bullyin_vyTcYk6f-bA_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.47s/it]


  Load:   0.09s
  Chunk:  0.07s
  Embed:  4.42s
  Import: 1.62s
  Total:  6.19s

Processing Episode 361...
Loading transcript for episode 361: transcripts\Naruhodo _361 - O que acontece quando tomamos um s_cEUb8czuY6Y_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.38s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.16s
  Import: 1.71s
  Total:  5.99s

Processing Episode 362...
Loading transcript for episode 362: transcripts\Naruhodo _362 - Reconhecer-se no espelho é sinal d_DYFYSiz6-lM_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.51s/it]


  Load:   0.06s
  Chunk:  0.08s
  Embed:  4.53s
  Import: 1.52s
  Total:  6.19s

Processing Episode 363...
Loading transcript for episode 363: transcripts\Naruhodo _363 - Jejum de dopamina funciona__908qoFZG8rY_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  4.57s
  Import: 1.82s
  Total:  6.51s

Processing Episode 364...
Loading transcript for episode 364: transcripts\Naruhodo _364 - O que é e quais são os impactos do_qgQpXhB3EZ8_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.05s
  Chunk:  0.08s
  Embed:  3.94s
  Import: 1.71s
  Total:  5.78s

Processing Episode 365...
Loading transcript for episode 365: transcripts\Naruhodo _365 - O que é e quais são os impactos do_sDKUFSDgmXU_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.03s
  Import: 1.57s
  Total:  5.72s

Processing Episode 366...
Loading transcript for episode 366: transcripts\Naruhodo _366 - Por que temos ideias durante o ban_jYJUwNRZWHE_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  3.92s
  Import: 1.72s
  Total:  5.77s

Processing Episode 367...
Loading transcript for episode 367: transcripts\Naruhodo _367 - Estamos ficando mais esquecidos do_ouilklEqKAU_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.35s/it]


  Load:   0.05s
  Chunk:  0.04s
  Embed:  4.06s
  Import: 1.62s
  Total:  5.78s

Processing Episode 368...
Loading transcript for episode 368: transcripts\Naruhodo _368 - Água alcalina faz bem pra saúde__Lf_zJEnyzuU_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.09s
  Import: 2.17s
  Total:  6.38s

Processing Episode 369...
Loading transcript for episode 369: transcripts\Naruhodo _369 - É mais difícil fazer amigos quando_e7yl-9-T6xc_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.34s
  Import: 2.05s
  Total:  6.54s

Processing Episode 370...
Loading transcript for episode 370: transcripts\Naruhodo _370 - Homens que acham seu pênis pequeno_1BXyvGQ0HCc_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.31s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  3.95s
  Import: 1.49s
  Total:  5.56s

Processing Episode 371...
Loading transcript for episode 371: transcripts\Naruhodo _371 - Qual o impacto do alcoolismo nos d_JAIjJ6E8ZHk_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.43s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.30s
  Import: 1.78s
  Total:  6.20s

Processing Episode 372...
Loading transcript for episode 372: transcripts\Naruhodo _372 - Qual o impacto do alcoolismo nos d_ZRwC2GQevIo_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.19s
  Import: 1.60s
  Total:  5.94s

Processing Episode 373...
Loading transcript for episode 373: transcripts\Naruhodo _373 - Como funciona a carreira de cienti_8ZaQHTb-o4U_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.51s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.55s
  Import: 1.61s
  Total:  6.28s

Processing Episode 374...
Loading transcript for episode 374: transcripts\Naruhodo _374 - Ser uma pessoa bagunceira é um pro_mRwy8BhQtGk_pt.txt
Chunked into 73 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.36s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.09s
  Import: 1.74s
  Total:  5.93s

Processing Episode 375...
Loading transcript for episode 375: transcripts\Naruhodo _375 - Por que cutucamos o nariz__N_iB-EHHh5g_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.61s/it]


  Load:   0.09s
  Chunk:  0.07s
  Embed:  4.84s
  Import: 2.63s
  Total:  7.63s

Processing Episode 376...
Loading transcript for episode 376: transcripts\Naruhodo _376 - Como fazer alguém mudar de ideia__0xHqRuLjqPU_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:06<00:00,  2.05s/it]


  Load:   0.10s
  Chunk:  0.13s
  Embed:  6.19s
  Import: 3.27s
  Total:  9.69s

Processing Episode 377...
Loading transcript for episode 377: transcripts\Naruhodo _377 - Aprendemos melhor fazendo pausas__PZVVN9lHeno_pt.txt
Chunked into 70 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.64s/it]


  Load:   0.07s
  Chunk:  0.06s
  Embed:  4.94s
  Import: 2.06s
  Total:  7.13s

Processing Episode 378...
Loading transcript for episode 378: transcripts\Naruhodo _378 - Por que avisos de perigo não são s_lKabJ3lQOHU_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.82s/it]


  Load:   0.09s
  Chunk:  0.06s
  Embed:  5.48s
  Import: 1.67s
  Total:  7.29s

Processing Episode 379...
Loading transcript for episode 379: transcripts\Naruhodo _379 - Como nós nos tornamos nós__fI9rqAJfcUU_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.48s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.43s
  Import: 1.97s
  Total:  6.51s

Processing Episode 380...
Loading transcript for episode 380: transcripts\Naruhodo _380 - Por que temos animais domésticos_ ___zJRw5Fcw8_pt.txt
Chunked into 73 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.60s/it]


  Load:   0.09s
  Chunk:  0.05s
  Embed:  4.80s
  Import: 2.21s
  Total:  7.15s

Processing Episode 381...
Loading transcript for episode 381: transcripts\Naruhodo _381 - Por que temos animais domésticos_ _gjS_GVsL3tw_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.62s/it]


  Load:   0.11s
  Chunk:  0.12s
  Embed:  4.88s
  Import: 2.12s
  Total:  7.23s

Processing Episode 382...
Loading transcript for episode 382: transcripts\Naruhodo _382 - Quem ama o feio bonito lhe parece__xI_DO_epNMg_pt.txt
Chunked into 93 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.69s/it]


  Load:   0.08s
  Chunk:  0.05s
  Embed:  5.06s
  Import: 1.91s
  Total:  7.10s

Processing Episode 383...
Loading transcript for episode 383: transcripts\Naruhodo _383 - Por que beijamos na boca__PLXIShSn61Q_pt.txt
Chunked into 66 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.29s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  3.88s
  Import: 1.59s
  Total:  5.58s

Processing Episode 384...
Loading transcript for episode 384: transcripts\Naruhodo _384 - Por que tomamos choque quando enco_DhKsqKRHwsw_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.83s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  5.51s
  Import: 2.18s
  Total:  7.81s

Processing Episode 385...
Loading transcript for episode 385: transcripts\Naruhodo _385 - O que é o fenômeno da _melhora da _B0F0nV_dwwI_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.51s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.54s
  Import: 1.47s
  Total:  6.13s

Processing Episode 386...
Loading transcript for episode 386: transcripts\Naruhodo _386 - Como é calculada a pontuação do EN_o_4DbSVo9Rk_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.60s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.80s
  Import: 1.90s
  Total:  6.81s

Processing Episode 387...
Loading transcript for episode 387: transcripts\Naruhodo _387 - Somos bons _ou maus_ por natureza__Fx37e0PUgY4_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.59s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.78s
  Import: 2.28s
  Total:  7.16s

Processing Episode 388...
Loading transcript for episode 388: transcripts\Naruhodo _388 - Somos bons _ou maus_ por natureza__xwAEaMyfm0Q_pt.txt
Chunked into 83 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.71s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  5.13s
  Import: 2.14s
  Total:  7.39s

Processing Episode 389...
Loading transcript for episode 389: transcripts\Naruhodo _389 - Por que repetir palavras deixa ela_JKN89pAb10U_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.55s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.64s
  Import: 2.32s
  Total:  7.06s

Processing Episode 390...
Loading transcript for episode 390: transcripts\Naruhodo _390 - Fazer compras para si mesma melhor_k98xTfw9gTs_pt.txt
Chunked into 68 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.25s/it]


  Load:   0.40s
  Chunk:  0.03s
  Embed:  3.75s
  Import: 1.76s
  Total:  5.94s

Processing Episode 391...
Loading transcript for episode 391: transcripts\Naruhodo _391 - Por que sentimos inveja__tNZAJS7udrY_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.57s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.71s
  Import: 2.12s
  Total:  6.99s

Processing Episode 392...
Loading transcript for episode 392: transcripts\Naruhodo _392 - Sentimos atração por pessoas parec_4xHdHpYDTkM_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.43s/it]


  Load:   0.05s
  Chunk:  0.07s
  Embed:  4.28s
  Import: 1.70s
  Total:  6.10s

Processing Episode 393...
Loading transcript for episode 393: transcripts\Naruhodo _393 - A psicologia positiva tem validade_LnSZCHHfoWI_pt.txt
Chunked into 87 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.52s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.57s
  Import: 1.85s
  Total:  6.52s

Processing Episode 394...
Loading transcript for episode 394: transcripts\Naruhodo _394 - A psicologia positiva tem validade_n8h3zC7YLNs_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.35s
  Import: 1.85s
  Total:  6.31s

Processing Episode 395...
Loading transcript for episode 395: transcripts\Naruhodo _395 - O que é força de vontade__5bR1RNVo7kM_pt.txt
Chunked into 76 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.41s
  Import: 1.51s
  Total:  6.00s

Processing Episode 396...
Loading transcript for episode 396: transcripts\Naruhodo _396 - O que fazer frente ao aquecimento _RchVGabxOdo_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.48s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.43s
  Import: 2.00s
  Total:  6.53s

Processing Episode 397...
Loading transcript for episode 397: transcripts\Naruhodo _397 - Por que ficamos entediados__FAZ9BPv_6O4_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.35s
  Import: 1.63s
  Total:  6.08s

Processing Episode 398...
Loading transcript for episode 398: transcripts\Naruhodo _398 - Jejum intermitente funciona__lTkWGFFkOLo_pt.txt
Chunked into 90 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.81s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  5.44s
  Import: 2.01s
  Total:  7.55s

Processing Episode 399...
Loading transcript for episode 399: transcripts\Naruhodo _399 - Assistir à pornografia vicia__vByA0QVSOb8_pt.txt
Chunked into 75 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.47s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.42s
  Import: 1.73s
  Total:  6.26s

Processing Episode 400...
Loading transcript for episode 400: transcripts\Naruhodo _400 - Luminar resolverá o enigma que tra_4tPEyrJIKUE_pt.txt
Chunked into 63 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.72s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  3.44s
  Import: 1.38s
  Total:  4.95s

Processing Episode 401...
Loading transcript for episode 401: transcripts\Naruhodo _401 - Prêmio IgNobel 2023 - Parte 1 de 2_4ZyMMzb1iSo_pt.txt
Chunked into 61 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.72s/it]


  Load:   0.07s
  Chunk:  0.03s
  Embed:  3.44s
  Import: 1.80s
  Total:  5.34s

Processing Episode 402...
Loading transcript for episode 402: transcripts\Naruhodo _402 - Prêmio IgNobel 2023 - Parte 2 de 2_Y9Hw6yw7sw8_pt.txt
Chunked into 62 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.76s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  3.55s
  Import: 1.20s
  Total:  4.87s

Processing Episode 404...
Loading transcript for episode 404: transcripts\Naruhodo _404 - Por que algumas pessoas gostam de _pTSZ--4TKMk_pt.txt
Chunked into 95 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.74s/it]


  Load:   0.07s
  Chunk:  0.12s
  Embed:  5.24s
  Import: 1.98s
  Total:  7.41s

Processing Episode 405...
Loading transcript for episode 405: transcripts\Naruhodo _405 - O que é o infinito__Vdu5LRFKa-M_pt.txt
Chunked into 95 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.73s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  5.21s
  Import: 2.46s
  Total:  7.79s

Processing Episode 406...
Loading transcript for episode 406: transcripts\Naruhodo _406 - As fases do luto têm validade cien_VltGGsSfNsI_pt.txt
Chunked into 94 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.72s/it]


  Load:   0.07s
  Chunk:  0.08s
  Embed:  5.15s
  Import: 2.22s
  Total:  7.53s

Processing Episode 407...
Loading transcript for episode 407: transcripts\Naruhodo _407 - Existe razão sem emoção__qUxluRrHV3E_pt.txt
Chunked into 95 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.83s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  5.49s
  Import: 1.93s
  Total:  7.53s

Processing Episode 408...
Loading transcript for episode 408: transcripts\Naruhodo _408 - Por que alguns GIFs parecem ter so_A26IemIBxwQ_pt.txt
Chunked into 71 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.32s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  3.95s
  Import: 2.07s
  Total:  6.14s

Processing Episode 409...
Loading transcript for episode 409: transcripts\Naruhodo _409 - Por que contamos piadas__yil8nUgex2g_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.60s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  4.81s
  Import: 2.48s
  Total:  7.43s

Processing Episode 410...
Loading transcript for episode 410: transcripts\Naruhodo _410 - Por que cães correm atrás de veícu_5WFh2g_epOU_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.44s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  4.34s
  Import: 1.73s
  Total:  6.17s

Processing Episode 411...
Loading transcript for episode 411: transcripts\Naruhodo _411 - Por que traímos_ - Parte 1 de 2_kVruX3Mhxig_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]


  Load:   0.06s
  Chunk:  0.06s
  Embed:  4.40s
  Import: 2.09s
  Total:  6.61s

Processing Episode 412...
Loading transcript for episode 412: transcripts\Naruhodo _412 - Por que traímos_ - Parte 2 de 2_Towh8afX65Y_pt.txt
Chunked into 84 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.78s/it]


  Load:   0.05s
  Chunk:  0.05s
  Embed:  5.36s
  Import: 2.23s
  Total:  7.69s

Processing Episode 413...
Loading transcript for episode 413: transcripts\Naruhodo _413 - Como sabemos se uma coisa é causa _335oKVAm99Y_pt.txt
Chunked into 85 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.82s/it]


  Load:   0.06s
  Chunk:  0.08s
  Embed:  5.47s
  Import: 2.22s
  Total:  7.84s

Processing Episode 414...
Loading transcript for episode 414: transcripts\Naruhodo _414 - A educação científica salvará o mu_OLaBswwX9yM_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.75s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  5.24s
  Import: 1.79s
  Total:  7.18s

Processing Episode 415...
Loading transcript for episode 415: transcripts\Naruhodo _415 - Subir escadas pode ajudar pessoas _jqhtO6W03Cc_pt.txt
Chunked into 81 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.87s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  5.61s
  Import: 2.26s
  Total:  8.01s

Processing Episode 416...
Loading transcript for episode 416: transcripts\Naruhodo _416 - Saunas melhoram a saúde__KSAdxaKvZkI_pt.txt
Chunked into 74 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.38s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.14s
  Import: 1.65s
  Total:  5.89s

Processing Episode 417...
Loading transcript for episode 417: transcripts\Naruhodo _417 - Por que respondemos desastres com _UYJ2eKpZX_4_pt.txt
Chunked into 93 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.89s/it]


  Load:   0.08s
  Chunk:  0.07s
  Embed:  5.67s
  Import: 2.37s
  Total:  8.19s

Processing Episode 418...
Loading transcript for episode 418: transcripts\Naruhodo _418 - O que é a birra__Kywng9Fi7bo_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.34s/it]


  Load:   0.07s
  Chunk:  0.04s
  Embed:  4.05s
  Import: 1.79s
  Total:  5.95s

Processing Episode 419...
Loading transcript for episode 419: transcripts\Naruhodo _419 - Maconha faz mal_ - Parte 1 de 2_cvLTh2bKPiQ_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.38s/it]


  Load:   0.05s
  Chunk:  0.07s
  Embed:  4.16s
  Import: 2.45s
  Total:  6.73s

Processing Episode 420...
Loading transcript for episode 420: transcripts\Naruhodo _420 - Maconha faz mal_ - Parte 2 de 2_F7wVcGvpoGA_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:03<00:00,  1.29s/it]


  Load:   0.07s
  Chunk:  0.09s
  Embed:  3.90s
  Import: 1.78s
  Total:  5.84s

Processing Episode 421...
Loading transcript for episode 421: transcripts\Naruhodo _421 - Por que guardamos segredos__fARgZ3WrnVk_pt.txt
Chunked into 80 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.33s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.01s
  Import: 1.78s
  Total:  5.89s

Processing Episode 422...
Loading transcript for episode 422: transcripts\Naruhodo _422 - Crianças acreditam em contos de fa_T1XyI8YURug_pt.txt
Chunked into 88 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.56s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  4.67s
  Import: 1.87s
  Total:  6.64s

Processing Episode 423...
Loading transcript for episode 423: transcripts\Naruhodo _423 - Lentes que filtram a cor azul são _cYpaucMUZSc_pt.txt
Chunked into 72 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.37s/it]


  Load:   0.06s
  Chunk:  0.05s
  Embed:  4.13s
  Import: 1.63s
  Total:  5.87s

Processing Episode 424...
Loading transcript for episode 424: transcripts\Naruhodo _424 - O que é competitividade_ - Parte 1_noPHBDvkDUc_pt.txt
Chunked into 78 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.41s/it]


  Load:   0.07s
  Chunk:  0.05s
  Embed:  4.22s
  Import: 1.86s
  Total:  6.19s

Processing Episode 425...
Loading transcript for episode 425: transcripts\Naruhodo _425 - O que é competitividade_ - Parte 2_bMkLimosW0E_pt.txt
Chunked into 77 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.42s/it]


  Load:   0.07s
  Chunk:  0.07s
  Embed:  4.26s
  Import: 1.94s
  Total:  6.35s

Processing Episode 426...
Loading transcript for episode 426: transcripts\Naruhodo _426 - Seu nome pode afetar a forma do se_8p2TLmaXHCE_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.45s/it]


  Load:   0.08s
  Chunk:  0.06s
  Embed:  4.36s
  Import: 2.01s
  Total:  6.50s

Processing Episode 427...
Loading transcript for episode 427: transcripts\Naruhodo _427 - Prêmio IgNobel 2024 - Parte 1 de 2_OC5NmqIbT9o_pt.txt
Chunked into 54 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:02<00:00,  1.41s/it]


  Load:   0.06s
  Chunk:  0.02s
  Embed:  2.83s
  Import: 1.27s
  Total:  4.18s

Processing Episode 428...
Loading transcript for episode 428: transcripts\Naruhodo _428 - Prêmio IgNobel 2024 - Parte 2 de 2_oZi57dhEgQ0_pt.txt
Chunked into 61 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 2/2 [00:03<00:00,  1.66s/it]


  Load:   0.19s
  Chunk:  0.06s
  Embed:  3.33s
  Import: 1.34s
  Total:  4.91s

Processing Episode 429...
Loading transcript for episode 429: transcripts\Naruhodo _429 - Qual o impacto das bets em nossas _g8lC8YEJRcQ_pt.txt
Chunked into 98 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 4/4 [00:05<00:00,  1.28s/it]


  Load:   0.15s
  Chunk:  0.08s
  Embed:  5.12s
  Import: 2.32s
  Total:  7.67s

Processing Episode 430...
Loading transcript for episode 430: transcripts\Naruhodo _430 - Por que é tão difícil deixar o ran_u0IesoD4A9A_pt.txt
Chunked into 99 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 4/4 [00:05<00:00,  1.46s/it]


  Load:   0.07s
  Chunk:  0.11s
  Embed:  5.87s
  Import: 2.76s
  Total:  8.81s

Processing Episode 431...
Loading transcript for episode 431: transcripts\Naruhodo _431 - Empreender se aprende ou é algo qu_Fsso3SDp650_pt.txt
Chunked into 89 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.68s/it]


  Load:   0.06s
  Chunk:  0.08s
  Embed:  5.07s
  Import: 2.58s
  Total:  7.79s

Processing Episode 432...
Loading transcript for episode 432: transcripts\Naruhodo _432 - O uso de cigarros eletrônicos é um_XCiHLKVWDVQ_pt.txt
Chunked into 93 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.79s/it]


  Load:   0.08s
  Chunk:  0.06s
  Embed:  5.40s
  Import: 2.28s
  Total:  7.83s

Processing Episode 433...
Loading transcript for episode 433: transcripts\Naruhodo _433 - Existe amizade entre homens e mulh_EFVaBfGaowg_pt.txt
Chunked into 79 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.46s/it]


  Load:   0.06s
  Chunk:  0.03s
  Embed:  4.39s
  Import: 1.80s
  Total:  6.28s

Processing Episode 434...
Loading transcript for episode 434: transcripts\Naruhodo _434 - Existe amizade entre homens e mulh_H6D1yCni0rc_pt.txt
Chunked into 96 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.87s/it]


  Load:   0.06s
  Chunk:  0.04s
  Embed:  5.62s
  Import: 2.31s
  Total:  8.03s

Processing Episode 435...
Loading transcript for episode 435: transcripts\Naruhodo _435 - Jogar videogame pode ajudar a cura_7Ob___Y97d4_pt.txt
Chunked into 94 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.88s/it]


  Load:   0.07s
  Chunk:  0.10s
  Embed:  5.64s
  Import: 2.16s
  Total:  7.97s

Processing Episode 436...
Loading transcript for episode 436: transcripts\Naruhodo _436 - A violência faz parte da _natureza_AnjRI8sfTJQ_pt.txt
Chunked into 86 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.60s/it]


  Load:   0.08s
  Chunk:  0.08s
  Embed:  4.80s
  Import: 2.38s
  Total:  7.34s

Processing Episode 437...
Loading transcript for episode 437: transcripts\Naruhodo _437 - O termo _macho alfa_ faz sentido_ _Qx1z1R_He_c_pt.txt
Chunked into 82 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:04<00:00,  1.40s/it]


  Load:   0.06s
  Chunk:  0.07s
  Embed:  4.21s
  Import: 2.59s
  Total:  6.94s

Processing Episode 438...
Loading transcript for episode 438: transcripts\Naruhodo _438 - O termo _macho alfa_ faz sentido_ _UNKh0Zd3h_k_pt.txt
Chunked into 91 chunks (size=300, overlap=100)
First 3 chunk lengths (tokens): [300, 300, 300]


Batches: 100%|██████████| 3/3 [00:05<00:00,  1.67s/it]


  Load:   0.07s
  Chunk:  0.08s
  Embed:  5.03s
  Import: 2.66s
  Total:  7.83s

=== All Episode Timing Summary ===
Episode 1: 1.37s (Load: 0.06s, Chunk: 0.03s, Embed: 0.88s, Import: 0.40s)
Episode 2: 0.93s (Load: 0.08s, Chunk: 0.01s, Embed: 0.77s, Import: 0.07s)
Episode 3: 0.54s (Load: 0.07s, Chunk: 0.00s, Embed: 0.42s, Import: 0.05s)
Episode 4: 1.36s (Load: 0.04s, Chunk: 0.02s, Embed: 1.08s, Import: 0.22s)
Episode 5: 1.14s (Load: 0.08s, Chunk: 0.01s, Embed: 0.93s, Import: 0.13s)
Episode 6: 0.96s (Load: 0.06s, Chunk: 0.01s, Embed: 0.81s, Import: 0.08s)
Episode 8: 1.70s (Load: 0.04s, Chunk: 0.02s, Embed: 1.46s, Import: 0.17s)
Episode 9: 0.34s (Load: 0.05s, Chunk: 0.00s, Embed: 0.22s, Import: 0.07s)
Episode 10: 0.94s (Load: 0.07s, Chunk: 0.01s, Embed: 0.75s, Import: 0.11s)
Episode 11: 0.59s (Load: 0.05s, Chunk: 0.00s, Embed: 0.47s, Import: 0.06s)
Episode 12: 0.55s (Load: 0.04s, Chunk: 0.00s, Embed: 0.44s, Import: 0.07s)
Episode 13: 1.35s (Load: 0.05s, Chunk: 0.02s, Embed: 1.16s, Import:

<a name="analytical-possibilities-in-neo4j"></a>
## Analytical Possibilities in Neo4j
Once the data is imported into Neo4j, several kinds of analysis become possible, including:

- **Cluster Analysis:**
Identify clusters or communities of episodes that share many common references, which might indicate similar themes or topics.

- **Centrality Measures:**
Calculate metrics like degree centrality to identify which episodes or references are the most influential or central in the network.

- **Path Analysis:**
Trace paths between episodes to understand how scientific ideas or themes evolve and interconnect across different episodes.

- **Thematic Mapping:**
Explore how different subjects or areas of science intersect by analyzing shared references among episodes.

- **Content Recommendations:**
Build recommendation systems that suggest related episodes based on shared references or thematic similarities.
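
As a rough illustration of three of these ideas (centrality, clustering, and recommendations), here is a minimal sketch in plain Python. It is not the project's implementation. The `episode_refs` mapping and its reference identifiers are hypothetical toy data standing in for the episode-to-reference pairs you would read from the graph, and the function names are made up for this example. In Neo4j the same questions would normally be answered with Cypher queries over the imported nodes and relationships.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: episode number -> set of reference identifiers.
# In the real project these pairs would be read from the Neo4j graph.
episode_refs = {
    1: {"ref_a", "ref_b", "ref_c"},
    2: {"ref_b", "ref_c"},
    3: {"ref_c", "ref_d"},
    4: {"ref_e"},
}


def reference_degree(episode_refs):
    """Count how many episodes cite each reference (unnormalized degree)."""
    counts = Counter()
    for refs in episode_refs.values():
        counts.update(refs)
    return counts


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(episode, episode_refs, top_n=3):
    """Rank other episodes by shared-reference similarity, skipping zero overlap."""
    base = episode_refs[episode]
    scored = [
        (other, jaccard(base, refs))
        for other, refs in episode_refs.items()
        if other != episode
    ]
    scored = [(other, score) for other, score in scored if score > 0]
    # Highest similarity first; ties broken by episode number.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


def clusters(episode_refs):
    """Group episodes into connected components of the shared-reference graph."""
    parent = {episode: episode for episode in episode_refs}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(episode_refs, 2):
        if episode_refs[a] & episode_refs[b]:
            parent[find(a)] = find(b)

    groups = {}
    for episode in episode_refs:
        groups.setdefault(find(episode), set()).add(episode)
    return sorted(groups.values(), key=min)
```

Calling `reference_degree(episode_refs).most_common(3)` lists the most-cited references, `recommend(1, episode_refs)` ranks the episodes most similar to episode 1, and `clusters(episode_refs)` returns groups of episodes linked by at least one shared reference. Jaccard similarity and connected components are only simple stand-ins here; a real analysis might use weighted similarity or a dedicated community-detection algorithm.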


<a name="conclusion"></a>
## Conclusion
This project presents a complete pipeline for extracting, normalizing, and importing Naruhodo podcast episode data into a Neo4j graph database. With environment configuration, data normalization, and robust Neo4j import techniques in place, the resulting graph is ready for exploring the hidden connections between episodes and references. The approach demonstrates practical work with Python and graph databases and opens up many avenues for more sophisticated data analysis, which also makes it a useful portfolio piece.

Feel free to explore the code and extend its functionality. Happy coding and graph exploring!