<a href="https://colab.research.google.com/github/jukeyman/Captura/blob/master/ai_query_youtube.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
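# Pinned as a precaution: the static YouTubeTranscriptApi.get_transcript call used below belongs to the pre-1.0 interface.
!pip install "youtube-transcript-api<1.0"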

In [None]:
from youtube_transcript_api import YouTubeTranscriptApi
from google.colab import userdata
import google.generativeai as genai
import json
import pprint
# Set up Gemini credentials; see video: https://www.youtube.com/watch?v=S1elvCs1gyI
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel("gemini-1.5-flash-latest")

In [None]:
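def get_transcript(video_url):
  """Return the transcript entries for a YouTube watch URL, or [] on failure."""
  entries = []
  try:
    # Take the video ID from the `v=` query parameter and drop any parameters after it.
    # Parsing inside the try block means a URL without `v=` is reported, not raised.
    video_id = video_url.split("v=")[1].split("&")[0]
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
    entries = list(transcript)
  except Exception as e:
    print(f"Error fetching transcript for {video_url}:", e)
  return entries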

def get_transcripts(video_url_list):
  if isinstance(video_url_list, str):
    video_url_list = [video_url_list]
  transcripts = []
  for video_url in video_url_list:
    transcript = get_transcript(video_url)
    if len(transcript) > 0:
      transcripts.append({"Video": video_url, "Transcript": transcript})
  print(f"Got {len(transcripts)} transcripts")
  return transcripts
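
The string splitting in `get_transcript` assumes a standard `watch?v=` URL. As a rough sketch of a more tolerant approach, the helper below uses the standard library's `urllib.parse` to read the ID from either a watch link or a `youtu.be` short link. The name `extract_video_id` is hypothetical and not part of the notebook or of `youtube-transcript-api`, and the set of URL shapes it accepts is an assumption.

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(video_url):
    """Return the video ID from a YouTube URL, or None if none is found (hypothetical helper)."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links keep the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch links keep the ID in the `v` query parameter, in any position.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None

print(extract_video_id("https://www.youtube.com/watch?v=y3Umo_jd5AA&pp=ygUX"))  # y3Umo_jd5AA
print(extract_video_id("https://youtu.be/y3Umo_jd5AA?t=30"))                    # y3Umo_jd5AA
```

Returning `None` instead of raising lets a caller skip malformed URLs in a batch without stopping the whole run.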

In [None]:
def query_youtubes(query, video_url_list):
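  transcripts = get_transcripts(video_url_list)
  # Flatten each transcript into plain text, labeled with its source video URL.
  formatted_transcripts = "\n\n".join(
      f"Video: {t['Video']}\n" + " ".join(entry["text"] for entry in t["Transcript"])
      for t in transcripts
  )
  prompt = f"""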
    I'd like to analyze the following YouTube video transcript and create a comprehensive, actionable, and visually rich output.

    Here's the question/topic:
    ---
    {query}
    ---

    Here's the YouTube transcript:
    ---
    {formatted_transcripts}
    ---

    Output Requirements:

    1.  **Full Detailed Summary:** Provide a comprehensive, detailed summary of all content covered in the transcript. This should be a thorough overview of the entire video content, capturing all key points and nuances.

    2.  **Key Sections and Jewels:** Identify and list key sections, critical information, and 'jewels' (valuable insights) from the transcript. Structure this as a list of bullet points or numbered items, with each point explained in detail.

    3.  **Full Transcript:** Include the full, unedited transcript of the video.

    4.  **Structured E-book:** Create a structured e-book version of the content. This should include:
        *   A detailed table of contents with clickable links.
        *   Well-defined chapters and sub-sections with clear headings and subheadings.
        *   Use of markdown formatting for headings, lists, emphasis, and code blocks.
        *   Inclusion of relevant images generated by AI to illustrate key concepts.
        *   A professional and engaging writing style.

    5.  **Step-by-Step Guide:** Develop a detailed, actionable step-by-step guide on how to create any product, service, course, item, or download mentioned in the video. This should include:
        *   All necessary steps, materials, and instructions.
        *   Clear and concise language.
        *   Visual aids (images generated by AI) where appropriate.
        *   Troubleshooting tips and common pitfalls.

    6.  **Product Outline and Setup:** Provide a detailed outline and setup guide for any product, service, course, item, or download mentioned in the video. This should include:
        *   A structured plan for creation and implementation.
        *   A list of required resources and tools.
        *   A timeline for each stage of development.
        *   A detailed breakdown of costs and potential revenue streams.

    7.  **Structured Data Output:** Provide a structured data output (JSON) of all the key information, including:
        *   Key sections and jewels.
        *   Steps from the step-by-step guide.
        *   Product outline and setup details.
        *   This should be easily parsable for further use.

    8.  **SEO Optimized Blog:** Create an SEO-optimized blog post based on the combined information from the video. This should include:
        *   Relevant keywords.
        *   Compelling headings and subheadings.
        *   A meta description.
        *   Internal and external links.
        *   Images generated by AI to enhance engagement.

    9.  **Questions and Answers:** Generate a list of questions that could be asked based on the content of the video, and provide detailed answers to those questions using the transcript.

    10. **Knowledge Base:** Create a structured knowledge base from the video content, including:
        *   A hierarchical structure of topics and subtopics.
        *   Detailed explanations and definitions.
        *   Links to relevant sections of the e-book and step-by-step guide.
        *   Images generated by AI to illustrate key concepts.

    11. **Agent Creation Instructions:** If the video mentions creating agents, provide detailed instructions on how to create these agents, including:
        *   The agent's purpose and capabilities.
        *   The tools and resources the agent needs.
        *   The prompts and instructions for the agent.
        *   Examples of how to use the agent.

    Additional Instructions:

    *   Do not download data from the internet. Use only the provided transcript.
    *   Ensure the output is college-level, structured, and written in a corporate style.
    *   Use clear and concise language.
    *   Provide actionable insights and practical steps.
    *   Use markdown formatting for readability.
    *   Generate images using AI where needed to enhance the content.
  """
  response = model.generate_content(prompt)
  return response.text
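
Requirement 7 of the prompt asks for a JSON section that is "easily parsable for further use". One possible way to pull that section out of the returned Markdown is sketched below. It assumes the model wraps the JSON in a fenced `json` code block, which is common but not guaranteed, and `extract_json_block` is a hypothetical helper name, not part of `google.generativeai`.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example does not contain a literal fence

def extract_json_block(markdown_text):
    """Parse the first fenced json block in markdown_text; return None if absent or invalid."""
    pattern = re.escape(FENCE) + r"json\s*\n(.*?)" + re.escape(FENCE)
    match = re.search(pattern, markdown_text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        # The model may emit malformed JSON; treat that the same as a missing block.
        return None

# Made-up response text, used only to exercise the helper.
sample = "Summary...\n" + FENCE + 'json\n{"key_sections": ["Deep work"]}\n' + FENCE + "\nMore text."
print(extract_json_block(sample))  # {'key_sections': ['Deep work']}
```

In the notebook this could be applied to the string returned by `query_youtubes`, with a `None` result signalling that the response needs manual inspection.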

In [None]:
videos = [
    "https://www.youtube.com/watch?v=y3Umo_jd5AA&pp=ygUXQ2FsIE5ld3BvcnQgbGV4IGZyaWRtYW4%3D",
    "https://www.youtube.com/watch?v=atuEOcRznpA&pp=ygUWQ2FsIE5ld3BvcnQgdGltIGZlcnJpcw%3D%3D",
    "https://www.youtube.com/watch?v=z6IgPEO2jAk&pp=ygUcQ2FsIE5ld3BvcnQgam9yZGFuIGhhcmJpbmdlcg%3D%3D",

]

query = "I need a full, detailed analysis with all content delivered in the requested outputs, plus complete code and a detailed step-by-step setup of what was covered in each video, including a full plan."

# print() keeps the Markdown line breaks readable; pprint would show the escaped string repr.
print(query_youtubes(query, videos))

In [None]:
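# @title
from google.colab import files
import json
transcripts = get_transcripts(videos)
fname = "transcripts.jsonl"
# Write one JSON object per line so the file content matches its .jsonl extension,
# and use a context manager so the file is closed before it is downloaded.
with open(fname, "w") as f:
  for record in transcripts:
    f.write(json.dumps(record) + "\n")
files.download(fname)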
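
A `.jsonl` file conventionally holds one JSON object per line. Assuming the downloaded file follows that layout, a minimal sketch for loading it back might look like the following. `read_jsonl` is a hypothetical helper, and the sample records are made up to mirror the `{"Video": ..., "Transcript": ...}` shape produced by `get_transcripts`.

```python
import json
import os
import tempfile

def read_jsonl(path):
    """Load a JSON Lines file into a list, skipping blank lines (hypothetical helper)."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records

# Illustrative records shaped like the output of get_transcripts; the values are made up.
sample = [
    {"Video": "https://www.youtube.com/watch?v=VIDEO_ID_1",
     "Transcript": [{"text": "hello", "start": 0.0, "duration": 1.5}]},
    {"Video": "https://www.youtube.com/watch?v=VIDEO_ID_2",
     "Transcript": [{"text": "world", "start": 0.0, "duration": 2.0}]},
]

# Write the sample to a temporary file, one record per line, then read it back.
path = os.path.join(tempfile.mkdtemp(), "sample.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for record in sample:
        f.write(json.dumps(record) + "\n")

loaded = read_jsonl(path)
print(len(loaded), loaded[0]["Video"])
```

Reading line by line keeps memory use proportional to one record at a time if the loop body is changed to process records instead of collecting them.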