##### Copyright 2024 Google LLC.


In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Voice memos


<table align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/doggy8088/gemini-api-cookbook/blob/zh-tw/examples/Voice_memos.zh.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
</table>


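This notebook provides a quick example of how to work with audio and text files together. You will use the Gemini API to help generate ideas for your next blog post, based on voice memos recorded on your phone and articles you have written previously.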


In [None]:
!pip install -U -q google-generativeai


In [None]:
import google.generativeai as genai

### Set up your API key

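To run the following cell, your API key must be stored in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example.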


In [None]:
from google.colab import userdata
GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
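
The cell above only works inside Colab, because `google.colab.userdata` is not importable elsewhere. The following is a minimal sketch of a fallback, assuming the key is also exported as an environment variable with the same name when running locally. The helper name `load_api_key` is hypothetical and not part of the SDK.

```python
import os


def load_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from Colab Secrets if available, else from the environment.

    `load_api_key` is a hypothetical helper for illustration only.
    """
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
        return userdata.get(name)
    except ImportError:
        pass

    # Assumption: outside Colab the key is stored in an environment variable.
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Set the {name} environment variable or Colab Secret.")
    return key


# Usage (not executed here): genai.configure(api_key=load_api_key())
```

Raising early when the key is missing gives a clearer error than a failed request later in the notebook.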

Install the PDF processing tools.


In [None]:
!apt install poppler-utils

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  poppler-utils
0 upgraded, 1 newly installed, 0 to remove and 45 not upgraded.
Need to get 186 kB of archives.
After this operation, 696 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 poppler-utils amd64 22.02.0-2ubuntu0.4 [186 kB]
Fetched 186 kB in 1s (326 kB/s)
Selecting previously unselected package poppler-utils.
(Reading database ... 121918 files and directories currently installed.)
Preparing to unpack .../poppler-utils_22.02.0-2ubuntu0.4_amd64.deb ...
Unpacking poppler-utils (22.02.0-2ubuntu0.4) ...
Setting up poppler-utils (22.02.0-2ubuntu0.4) ...
Processing triggers for man-db (2.10.2-1) ...


## Upload your audio and text files


In [None]:
!wget https://storage.googleapis.com/generativeai-downloads/data/Walking_thoughts_3.m4a
!wget https://storage.googleapis.com/generativeai-downloads/data/A_Possible_Future_for_Online_Content.pdf
!wget https://storage.googleapis.com/generativeai-downloads/data/Unanswered_Questions_and_Endless_Possibilities.pdf

--2024-05-14 20:33:10--  https://storage.googleapis.com/generativeai-downloads/data/Walking_thoughts_3.m4a
Resolving storage.googleapis.com (storage.googleapis.com)... 108.177.98.207, 74.125.197.207, 74.125.135.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|108.177.98.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2060207 (2.0M) [audio/x-m4a]
Saving to: ‘Walking_thoughts_3.m4a’


2024-05-14 20:33:10 (119 MB/s) - ‘Walking_thoughts_3.m4a’ saved [2060207/2060207]

--2024-05-14 20:33:10--  https://storage.googleapis.com/generativeai-downloads/data/A_Possible_Future_for_Online_Content.pdf
Resolving storage.googleapis.com (storage.googleapis.com)... 108.177.98.207, 74.125.197.207, 74.125.135.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|108.177.98.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2798700 (2.7M) [application/pdf]
Saving to: ‘A_Possible_Future_for_Online_Content.pdf’



In [None]:
audio_file_name = "Walking_thoughts_3.m4a"
audio_file = genai.upload_file(path=audio_file_name)
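
A failed `wget` does not stop the notebook, so a missing or empty download would only surface later as an upload error. The following is a minimal sketch of a check that could run before the upload cells. The helper name `find_missing` is hypothetical, and the file names are the ones downloaded above.

```python
from pathlib import Path


def find_missing(paths):
    """Return the paths that do not exist or are empty files.

    `find_missing` is a hypothetical helper for illustration only.
    """
    missing = []
    for p in map(Path, paths):
        if not p.is_file() or p.stat().st_size == 0:
            missing.append(p)
    return missing


downloads = [
    "Walking_thoughts_3.m4a",
    "A_Possible_Future_for_Online_Content.pdf",
    "Unanswered_Questions_and_Endless_Possibilities.pdf",
]
for path in find_missing(downloads):
    print(f"Missing or empty: {path}")
```

If the loop prints anything, re-run the download cell before calling `genai.upload_file`.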

## Extract text from the PDFs


In [None]:
!pdftotext A_Possible_Future_for_Online_Content.pdf
!pdftotext Unanswered_Questions_and_Endless_Possibilities.pdf
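
When `pdftotext` is given only a PDF path, it writes the text next to it with a `.txt` extension, which is why the next cells upload `.txt` files. The following is a minimal sketch of the same step from Python, assuming `pdftotext` from `poppler-utils` is on the `PATH`. The helper names `text_path_for` and `extract_text` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def text_path_for(pdf_path):
    """Return the .txt path that corresponds to a PDF path."""
    return Path(pdf_path).with_suffix(".txt")


def extract_text(pdf_path):
    """Run pdftotext on one PDF and return the path of the text file.

    Hypothetical helper: it passes the output path explicitly, so the
    result does not depend on pdftotext's default naming.
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils first.")
    out_path = text_path_for(pdf_path)
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(["pdftotext", str(pdf_path), str(out_path)], check=True)
    return out_path


# Usage (not executed here):
# blog_file_name = extract_text("A_Possible_Future_for_Online_Content.pdf")
```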

In [None]:
blog_file_name = "A_Possible_Future_for_Online_Content.txt"
blog_file = genai.upload_file(path=blog_file_name)

In [None]:
blog_file_name2 = "Unanswered_Questions_and_Endless_Possibilities.txt"
blog_file2 = genai.upload_file(path=blog_file_name2)

## System instructions

Write a detailed system instruction to configure the model.


In [None]:
si="""Objective: Transform raw thoughts and ideas into polished, engaging blog posts that capture a writer's unique style and voice.
Input:
Example Blog Posts (1-5): A user will provide examples of blog posts that resonate with their desired style and tone. These will guide you in understanding the preferences for word choice, sentence structure, and overall voice.
Audio Clips: A user will share a selection of brainstorming thoughts and key points through audio recordings. They will talk freely and openly, as if they were explaining their ideas to a friend.
Output:
Blog Post Draft: A well-structured first draft of the blog post, suitable for platforms like Substack or LinkedIn.
The draft will include:
Clear and engaging writing: you will strive to make the writing clear, concise, and interesting for the target audience.
Tone and style alignment: The language and style will closely match the examples provided, ensuring consistency with the desired voice.
Logical flow and structure: The draft will be organized with clear sections based on the content of the post.
Target word count: Aim for 500-800 words, but this can be adjusted based on user preferences.
Process:
Style Analysis: Carefully analyze the example blog posts provided by the user to identify key elements of their preferred style, including:
Vocabulary and word choice: Formal vs. informal, technical terms, slang, etc.
Sentence structure and length: Short and impactful vs. longer and descriptive sentences.
Tone and voice: Humorous, serious, informative, persuasive, etc.
Audio Transcription and Comprehension: Your audio clips will be transcribed with high accuracy. You will analyze them to extract key ideas, arguments, and supporting points.
Draft Generation: Using the insights from the audio and the style guidelines from the examples, you will generate a first draft of the blog post. This draft will include all relevant sections with supporting arguments or evidence, and a great ending that ties everything together and makes the reader want to invest in future readings.
"""

## Generate content


In [None]:
prompt = "Draft my next blog post based on my thoughts in this audio file and these two previous blog posts I wrote."

model = genai.GenerativeModel(model_name="models/gemini-1.5-pro-latest", system_instruction=si)

response = model.generate_content([prompt, blog_file, blog_file2, audio_file],
                                  request_options={"timeout": 600})
print(response.text)

## The Right to Think: Reframing "Throwaway Work"

Early in my career, I spent a lot of time working on visions, roadmaps, and ideas. Some of them never materialized, leading to large projects being scrapped or canceled. Even today, I encounter this, as do people on my team, and even entire companies! Priorities change, markets shift, there are countless reasons for it. But I remember feeling so frustrated, coming straight out of school with this ingrained idea that you're given an assignment, you do it, you get graded, and that's it.

There was no concept of "throwaway work" in that academic setting. You were given a task, expected to produce, and then evaluated.  Entering the workforce, I struggled to reconcile this ingrained mentality with the reality that much of what I produced might not matter. Projects shifted, priorities changed, and things I poured time into could be discarded without a second thought. It felt like a monumental waste of time.

It took a while to get over it, b
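
The system instruction asks for a draft of 500-800 words. The following is a minimal sketch for checking a draft against that target, assuming a simple whitespace-based word count is close enough. The helper names `word_count` and `within_target` are hypothetical.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (a rough approximation)."""
    return len(text.split())


def within_target(text: str, low: int = 500, high: int = 800) -> bool:
    """Return True if the draft length falls inside the target range."""
    return low <= word_count(text) <= high


# Usage (not executed here):
# print(word_count(response.text), within_target(response.text))
```

If the draft falls outside the range, one option is to state the desired length in the prompt and generate again.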

## Learn more

* Learn more about the File API with the [File API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb) quickstart.
