# Smartphones and Watches advisor
In this series of notebooks we will:
- **Load text from a YouTube video playlist**
- Split the loaded text into appropriate chunks
- Create a vector store to hold the chunk embeddings
- Create a question-answering chatbot:
    - with different prompting techniques
    - with different similarity measures
    - with or without chat memory

!['Data Connection'](../../images/les_numeriques_youtube/data_connection.png)

# How to create a text loader from a YouTube video playlist?

Starting from a playlist link:
- Get all the video URLs in the playlist
- Transcribe the audio of each video with the Whisper speech-to-text parser
- Create a data loader

In [1]:
# Imports 
# Youtube
import time
from pytube import YouTube, Playlist

# Env var
import os 
import sys
from dotenv import load_dotenv, find_dotenv

# OpenAI and LangChain
import openai
from langchain.document_loaders.generic import GenericLoader
from langchain.document_loaders.parsers import OpenAIWhisperParser
from langchain.document_loaders.blob_loaders.youtube_audio import YoutubeAudioLoader

# Save loader
from utils import save_docs_to_jsonl

In [None]:
# Les Numériques playlist: smartphones and connected watches
PLAYLIST_LINK = "https://www.youtube.com/playlist?list=PLhqibKns24ALiMaHTfEedcon0LzEAxu7q"
PATH_NAME_LOADER = './loaded_docs.jsonl'

In [None]:
# Video URLs from playlist
video_links = Playlist(PLAYLIST_LINK).video_urls
print(f"There are {len(video_links)} videos")

There are 302 videos


<div align="left">
      <a href="https://www.youtube.com/watch?v=PJ7qsF6LFSU">
         <img src="https://img.youtube.com/vi/PJ7qsF6LFSU/0.jpg" style="width:100%;">
      </a>
</div>
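
The thumbnail link above follows a simple pattern: the video ID from the `v` query parameter is reused in an `img.youtube.com` image URL. A minimal sketch of building that snippet for any video, assuming watch-style URLs like the one shown; the helper names `video_id_from_url` and `thumbnail_html` are hypothetical and not part of pytube.

```python
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> str:
    """Return the value of the `v` query parameter of a watch-style URL."""
    query = parse_qs(urlparse(url).query)
    if "v" not in query:
        raise ValueError(f"No video id found in {url!r}")
    return query["v"][0]


def thumbnail_html(url: str) -> str:
    """Build a clickable thumbnail snippet (hypothetical helper)."""
    video_id = video_id_from_url(url)
    return (
        f'<a href="{url}">'
        f'<img src="https://img.youtube.com/vi/{video_id}/0.jpg" style="width:100%;">'
        "</a>"
    )


html = thumbnail_html("https://www.youtube.com/watch?v=PJ7qsF6LFSU")
```

In a notebook, the resulting string could be rendered with `IPython.display.HTML(html)`.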

In [None]:
# Load the OpenAI API key from a .env file
sys.path.append('../..')
_ = load_dotenv(find_dotenv())
openai.api_key = os.environ['OPENAI_API_KEY']
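
Reading `os.environ['OPENAI_API_KEY']` raises a bare `KeyError` when the variable is missing. A small sketch of a friendlier check, assuming the key is expected either in a `.env` file or in the shell environment; `require_env` is a hypothetical helper, not part of `python-dotenv` or `openai`.

```python
import os


def require_env(name: str) -> str:
    """Return the environment variable `name`, or fail with a readable message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in the shell."
        )
    return value
```

It could then be used as `openai.api_key = require_env("OPENAI_API_KEY")` after the `load_dotenv` call.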

In [12]:
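# Load text from the first 30 videos: download the audio, then transcribe it with Whisper
save_dir = "docs/youtube/"

loader = GenericLoader(
    # YoutubeAudioLoader expects a flat list of URL strings
    YoutubeAudioLoader(video_links[:30], save_dir),
    OpenAIWhisperParser()
)
docs = loader.load()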

[youtube] Extracting URL: https://www.youtube.com/watch?v=MYX6xLYWJoU
[youtube] MYX6xLYWJoU: Downloading webpage
[youtube] MYX6xLYWJoU: Downloading ios player API JSON
[youtube] MYX6xLYWJoU: Downloading android player API JSON
[youtube] MYX6xLYWJoU: Downloading m3u8 information
[info] MYX6xLYWJoU: Downloading 1 format(s): 140
[download] docs/youtube//Stalké par un AirTag ？.m4a has already been downloaded
[download] 100% of  947.70KiB
[ExtractAudio] Not converting audio docs/youtube//Stalké par un AirTag ？.m4a; file is already in target format m4a
[youtube] Extracting URL: https://www.youtube.com/watch?v=THSjWVl80Qc
[youtube] THSjWVl80Qc: Downloading webpage
[youtube] THSjWVl80Qc: Downloading ios player API JSON
[youtube] THSjWVl80Qc: Downloading android player API JSON
[youtube] THSjWVl80Qc: Downloading m3u8 information
[info] THSjWVl80Qc: Downloading 1 format(s): 140
[download] docs/youtube//La nouvelle puce du Galaxy S23.m4a has already been downloaded
[download] 100% of    5.8
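
The loading cell only covers a slice of 30 URLs out of the playlist. If more videos are needed later, one option is to walk the URL list in fixed-size batches. A minimal sketch under that assumption; `batched_urls` is a hypothetical helper, the batch size of 30 simply mirrors the slice used in this notebook, and the demo URLs are placeholders.

```python
from typing import Iterator, List, Sequence


def batched_urls(urls: Sequence[str], batch_size: int = 30) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield list(urls[start:start + batch_size])


# Illustration with placeholder URLs (not real videos)
demo_urls = [f"https://example.com/watch?v={i}" for i in range(70)]
batch_sizes = [len(batch) for batch in batched_urls(demo_urls)]
```

Each batch could then be passed to `YoutubeAudioLoader(batch, save_dir)`. It may be worth using a separate `save_dir` per batch, since the loader reads the audio files back from that directory.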

In [16]:
docs[5]

Document(page_content="Bonjour à tous et bienvenue sur Les Numériques. Aujourd'hui, on va parler des Pixel 7 et Pixel 7 Pro, les nouveaux smartphones de Google. On commence par le premier, vendu 649 euros, le modèle standard reprend les mêmes codes de design que les Pixel, à savoir un hilo photo proéminent qui occupe toute la largeur du smartphone et met en avant les capteurs. Google a opté pour une finition en métal, contrairement au vert du modèle précédent. C'est une esthétique qui avait fait débat l'an dernier, même si l'ensemble paraît ici un peu moins agressif, notamment grâce aux bords arrondis. A la rédaction, nous, on aime plutôt bien, notamment dans les nouvelles couleurs, comme le vert citron. Alors, il faut savoir que le Pixel 7 a effectué une petite cure de minceur puisqu'il pèse 197 grammes. Ce n'est pas forcément léger, mais c'est 10 de moins que le Pixel 6. Le dos conserve toujours cette finition en vert qui accroche d'ailleurs un peu trop les trac

In [17]:
save_docs_to_jsonl(docs, PATH_NAME_LOADER)
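
The `save_docs_to_jsonl` helper comes from the project's `utils` module, which is not shown here. As a rough idea of what such a helper might do, here is a sketch that writes one JSON object per line and reads it back. Everything in it is an assumption: `SimpleDoc` is a hypothetical stand-in for a LangChain `Document`, and the `_sketch` functions are illustrative names, not the actual `utils` code.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a LangChain Document: text plus metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def save_docs_to_jsonl_sketch(docs, file_path):
    """Write one JSON object per line, keeping accented characters readable."""
    with open(file_path, "w", encoding="utf-8") as f:
        for doc in docs:
            record = {"page_content": doc.page_content, "metadata": doc.metadata}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_docs_from_jsonl_sketch(file_path):
    """Read the file back into SimpleDoc objects, skipping blank lines."""
    with open(file_path, "r", encoding="utf-8") as f:
        return [SimpleDoc(**json.loads(line)) for line in f if line.strip()]


# Round trip through a temporary file
demo_docs = [SimpleDoc("Bonjour à tous", {"source": "demo.m4a"})]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "loaded_docs.jsonl")
    save_docs_to_jsonl_sketch(demo_docs, path)
    restored = load_docs_from_jsonl_sketch(path)
```

This format only works when the metadata values are JSON-serializable. Writing with `ensure_ascii=False` and UTF-8 keeps French transcripts legible in the saved file.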