This is the repository for the LinkedIn Learning course OpenAI API: Multimodal development with GPT-4o. The full course is available from LinkedIn Learning.
In this hands-on course, you'll use the OpenAI API to leverage the multimodal capabilities of GPT-4o and function calling to extract text from images, conform the data to JSON, and call functions to save the extracted data to a spreadsheet.
See the readme file in the main branch for updated instructions and information.
This repository holds example data and two Jupyter Notebooks:

- `data/` holds a collection of images of random receipts and one wild-card.
- `expenses.csv` is the target CSV. At init, the CSV only holds column headings.
- `gp4o-setup.py` demonstrates how to access gpt-4o for multimodal prompting.
- `modular-process.py` and the module files in `utils/` demonstrate a comprehensive process of ingesting and interpreting multiple receipts and sending the data to a CSV file.
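The core flow described above can be sketched as follows. This is a minimal illustration, not the course's actual module code: it assumes the OpenAI Python SDK (v1+), an `OPENAI_API_KEY` in the environment, and hypothetical column names (`vendor`, `date`, `total`) — the real headings live in `expenses.csv`.

```python
import base64
import csv


def image_to_data_url(path: str) -> str:
    """Encode a local receipt image as a base64 data URL for the API."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:image/jpeg;base64,{encoded}"


def extract_receipt(path: str) -> str:
    """Ask gpt-4o to read one receipt image and return a JSON string."""
    from openai import OpenAI  # assumes the openai package is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # conform output to JSON
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Extract vendor, date, and total from this "
                                "receipt as JSON.",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": image_to_data_url(path)},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


def append_to_csv(row: dict, csv_path: str = "expenses.csv") -> None:
    """Append one extracted receipt to the target CSV (field names assumed)."""
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["vendor", "date", "total"])
        writer.writerow(row)
```

The notebooks break this same pipeline into modules under `utils/` and loop it over every image in `data/`.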
The first time you run a block in a Jupyter Notebook, you'll be asked to pick an environment. Follow the instructions and pick the first available Python environment.
NOTE: The first code block may take a while to load because the environment has to load first.
It is recommended that you run these exercise files in GitHub Codespaces. This gives you a pre-configured Python environment for running the Jupyter Notebooks. To use the exercise files, follow these steps:
- In the root folder, rename the file `env-template` to `.env`.
- Go to https://platform.openai.com/api-keys.
- Generate a new key and copy the key to your clipboard.
- In `.env`, add the key without quotes or parentheses.
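For context, loading the key from `.env` amounts to parsing `KEY=VALUE` lines into environment variables. The minimal sketch below shows what that involves; the course environment may instead rely on a library such as python-dotenv (`load_dotenv()`), which handles more edge cases.

```python
import os


def load_env_file(path: str = ".env") -> None:
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Blank lines and #-comments are skipped; existing environment
    variables are not overwritten.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Once the key is in the environment, the OpenAI SDK picks up `OPENAI_API_KEY` automatically, so it never needs to appear in the notebook code itself.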