##### Copyright 2025 Google LLC.

In [16]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Voice memos

This notebook provides a quick example of how to work with audio and text files in the same prompt. You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written.

In [None]:
%pip install -qU "google-genai>=1.0.0"


### Set up your API key

To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example.

In [13]:
from google import genai
from google.genai import types
from IPython.display import Markdown


In [None]:
from google.colab import userdata
client = genai.Client(api_key=userdata.get("GOOGLE_API_KEY"))  # read the key from the Colab Secret instead of hardcoding it
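
Outside Colab the `google.colab` module is not importable, so the Secret lookup cannot run. The following is a minimal sketch of a fallback, assuming the key is also exported as an environment variable with the same name, `GOOGLE_API_KEY`. The helper `load_api_key` is a hypothetical name used for illustration and is not part of the SDK.

```python
import os


def load_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from a Colab Secret, else from the environment."""
    try:
        # Only importable inside Colab; raises elsewhere.
        from google.colab import userdata

        key = userdata.get(name)
    except Exception:
        # Not in Colab, or the Secret is missing or not shared with this notebook.
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found under the name {name!r}")
    return key
```

With this in place, the client could be created as `genai.Client(api_key=load_api_key())`, which keeps the key out of the notebook source in both environments.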

Install PDF processing tools.

In [6]:
!apt install poppler-utils


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  poppler-utils
0 upgraded, 1 newly installed, 0 to remove and 29 not upgraded.
Need to get 186 kB of archives.
After this operation, 696 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 poppler-utils amd64 22.02.0-2ubuntu0.6 [186 kB]
Fetched 186 kB in 1s (125 kB/s)
Selecting previously unselected package poppler-utils.
(Reading database ... 126209 files and directories currently installed.)
Preparing to unpack .../poppler-utils_22.02.0-2ubuntu0.6_amd64.deb ...
Unpacking poppler-utils (22.02.0-2ubuntu0.6) ...
Setting up poppler-utils (22.02.0-2ubuntu0.6) ...
Processing triggers for man-db (2.10.2-1) ...


## Upload your audio and text files


In [3]:
!wget https://storage.googleapis.com/generativeai-downloads/data/Walking_thoughts_3.m4a
!wget https://storage.googleapis.com/generativeai-downloads/data/A_Possible_Future_for_Online_Content.pdf
!wget https://storage.googleapis.com/generativeai-downloads/data/Unanswered_Questions_and_Endless_Possibilities.pdf


--2025-03-22 10:24:19--  https://storage.googleapis.com/generativeai-downloads/data/Walking_thoughts_3.m4a
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.157.207, 142.251.8.207, 142.251.170.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.157.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2060207 (2.0M) [audio/x-m4a]
Saving to: ‘Walking_thoughts_3.m4a’


2025-03-22 10:24:21 (2.34 MB/s) - ‘Walking_thoughts_3.m4a’ saved [2060207/2060207]

--2025-03-22 10:24:21--  https://storage.googleapis.com/generativeai-downloads/data/A_Possible_Future_for_Online_Content.pdf
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.157.207, 142.251.8.207, 142.251.170.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.157.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798700 (2.7M) [application/pdf]
Saving to: ‘A_Possible_Future_for_Online_Content.pdf

In [5]:
audio_file_name = "Walking_thoughts_3.m4a"
audio_file = client.files.upload(file=audio_file_name)

## Extract text from the PDFs

In [7]:
!pdftotext A_Possible_Future_for_Online_Content.pdf
!pdftotext Unanswered_Questions_and_Endless_Possibilities.pdf
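
When no output path is given, `pdftotext` writes its result next to the input and swaps the `.pdf` extension for `.txt`, which is why the upload cells that follow reference `.txt` files. A rough Python equivalent of the shell commands above might look like this; the helper names `text_path_for` and `pdf_to_text` are illustrative assumptions, not part of any library.

```python
import shutil
import subprocess
from pathlib import Path


def text_path_for(pdf_path: str) -> Path:
    """Return the path pdftotext writes to when no output file is named."""
    return Path(pdf_path).with_suffix(".txt")


def pdf_to_text(pdf_path: str) -> Path:
    """Run pdftotext on one PDF and return the path of the extracted text."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found; install poppler-utils first")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(["pdftotext", pdf_path], check=True)
    return text_path_for(pdf_path)
```

Calling `pdf_to_text("A_Possible_Future_for_Online_Content.pdf")` would then return the `.txt` path that the next cell uploads.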

In [9]:
blog_file_name = "A_Possible_Future_for_Online_Content.txt"
blog_file = client.files.upload(file=blog_file_name)

In [10]:
blog_file_name2 = "Unanswered_Questions_and_Endless_Possibilities.txt"
blog_file2 = client.files.upload(file=blog_file_name2)
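
The system instruction below allows for one to five example posts, so uploading them in a loop may scale better than one cell per file. This is a small sketch that relies only on the `client.files.upload(file=...)` call already used above; `upload_all` is a hypothetical helper name.

```python
def upload_all(client, file_names):
    """Upload each file and return the file handles in the same order."""
    return [client.files.upload(file=name) for name in file_names]


# Example usage with the files from this notebook:
# blog_files = upload_all(client, [
#     "A_Possible_Future_for_Online_Content.txt",
#     "Unanswered_Questions_and_Endless_Possibilities.txt",
# ])
```

The returned list can be spliced directly into the `contents` argument of the generation call, for example `contents=[prompt, *blog_files, audio_file]`.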

## System instructions

Write a detailed system instruction to configure the model.

In [11]:
si = """Objective: Transform raw thoughts and ideas into polished, engaging blog posts that capture a writer's unique style and voice.
Input:
Example Blog Posts (1-5): A user will provide examples of blog posts that resonate with their desired style and tone. These will guide you in understanding the preferences for word choice, sentence structure, and overall voice.
Audio Clips: A user will share a selection of brainstorming thoughts and key points through audio recordings. They will talk freely and openly, as if they were explaining their ideas to a friend.
Output:
Blog Post Draft: A well-structured first draft of the blog post, suitable for platforms like Substack or LinkedIn.
The draft will include:
Clear and engaging writing: You will strive to make the writing clear, concise, and interesting for the target audience.
Tone and style alignment: The language and style will closely match the examples provided, ensuring consistency with the desired voice.
Logical flow and structure: The draft will be organized with clear sections based on the content of the post.
Target word count: Aim for 500-800 words, but this can be adjusted based on user preferences.
Process:
Style Analysis: Carefully analyze the example blog posts provided by the user to identify key elements of their preferred style, including:
Vocabulary and word choice: Formal vs. informal, technical terms, slang, etc.
Sentence structure and length: Short and impactful vs. longer and descriptive sentences.
Tone and voice: Humorous, serious, informative, persuasive, etc.
Audio Transcription and Comprehension: The audio clips will be transcribed with high accuracy. You will analyze them to extract key ideas, arguments, and supporting points.
Draft Generation: Using the insights from the audio and the style guidelines from the examples, you will generate a first draft of the blog post. This draft will include all relevant sections with supporting arguments or evidence, and a great ending that ties everything together and makes the reader want to invest in future readings.
"""

## Generate Content

In [12]:
prompt = "Draft my next blog post based on my thoughts in this audio file and these two previous blog posts I wrote."


In [14]:
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[prompt, blog_file, blog_file2, audio_file],
    config=types.GenerateContentConfig(
        system_instruction=si,
    ),
)
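
The system instruction asks for a draft of roughly 500-800 words, and the model does not always land in that range. A quick check on the returned text can flag drafts worth regenerating. This is a sketch only: the helper names are hypothetical, and splitting on whitespace gives an approximate count, since Markdown symbols such as `##` are counted as words.

```python
def word_count(text: str) -> int:
    """Approximate word count: the number of whitespace-separated tokens."""
    return len(text.split())


def within_target(text: str, low: int = 500, high: int = 800) -> bool:
    """Return True if the draft length falls inside the target range."""
    return low <= word_count(text) <= high


# Example usage after the generation call:
# print(word_count(response.text), within_target(response.text))
```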

In [15]:
Markdown(response.text)

Okay, here is a draft blog post based on your audio clip and the previous blog posts you provided. I focused on capturing your conversational tone and incorporating the key points you emphasized, as well as echoing the style of your previous writing.

## Writing to Think: Why "Throwaway Work" is Actually Essential

Early in my career, I spent a ton of time crafting visions, roadmaps, and big ideas. Some of them ended up on the scrap heap – entire projects canceled, resources reallocated. It still happens today, both to me and my team. Priorities change, markets shift, and sometimes, you just have to pivot.

I remember being incredibly frustrated by this early on. I came straight from school with this ingrained idea that you get an assignment, and you *do* it. No take-backs, no do-overs. Here’s the content we need; now produce it, and you're graded on the final result.

The workforce is WILDLY different.

It was hard to reconcile that drive to produce with the reality that *none of it might matter*. None of it might be used, none of it might go anywhere. It felt like a monumental waste of time.

It took me a while to get over it. I didn't truly appreciate it until I joined my current team, where we have this culture of "write to think." All of a sudden, it clicked. It's not simply that priorities change, and that's just part of the game. The act of producing, the content you create, is part of the process of making you better at what you do *in the future*. It's honing your skills.

This shifted how I thought about things.

So how do you extend that thinking to all aspects of your work? How do you speed up that learning and growing? That's where this culture of "write to think" comes into play.

Write more. You might still do the same amount, but write more, write earlier, write often. Write *without* the intention of it needing to be a final product. Be willing to scrap it and move on once it’s served its purpose.

The idea of "write to think" ties in nicely to iterative processes and doing-by-learning. But it’s such a great framing, and something I’m constantly talking about. It calls out the importance of writing to think and reframes young Jaclyn's mindset around "throwaway work," because in this model, there's no such thing.

It's all helping you hone your skills and get better over time.

In this new light, there is no such thing as throwaway work: it's all helping you hone your skills and get better over time. Each attempt, each project, each canceled assignment, helps you approach the next with a higher level of refined skills.
