# Convert Receipt Images to Excel

## How to Use

1. Click the **Files** icon (a folder) on the left to show the list of files and folders.
2. Click the **upload** icon (a sheet of paper with an upward arrow).
3. Select the receipt or invoice files (formats: PNG, JPG, JPEG, WEBP, HEIC, HEIF); you may select more than one. If a **Warning** appears, click **OK**.
4. Fill in **API_KEY = ''** with the key from https://aistudio.google.com/apikey so that it reads, for example, **API_KEY = 'ABCDEFG123'** (replace only the part between the single quotes ').
5. Click **Runtime > Run All** in the top menu. If a **Warning** appears, click **Run anyway**.
6. Wait until every cell (a box that contains code) shows a green check mark on its left side.
7. Click the **refresh** icon (a circular arrow next to the upload icon).
8. Find the file **nota.xlsx** and click the **three dots** icon to its right.
9. Choose **Download**.

## Change This Part

In [None]:
API_KEY = ''  # paste your key from https://aistudio.google.com/apikey between the quotes
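
If you would rather not paste the key into the notebook itself, a minimal sketch along these lines reads it from an environment variable when the inline value is empty. The helper `resolve_api_key` and the variable name `GEMINI_API_KEY` are assumptions made for illustration; neither is part of the original notebook.

```python
import os

def resolve_api_key(inline_key='', env_var='GEMINI_API_KEY'):
    """Return the inline key if set, otherwise the environment variable's value.

    Hypothetical helper: both the function and the variable name are assumptions.
    """
    key = inline_key.strip() or os.environ.get(env_var, '').strip()
    if not key:
        # Fail early with a readable message instead of a later API error.
        raise ValueError(f"No API key found; fill in API_KEY or set {env_var}.")
    return key
```

It could be used as `API_KEY = resolve_api_key(API_KEY)` before the client is created.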

## The Parts Below Do Not Need to Be Changed

In [None]:
PROMPT = '''
This is a purchase receipt. Extract the data and return the result as JSON.

Fill in the following fields:
- transaction date and time in the format %d/%m/%Y %H:%M:%S (if the date is missing, use 01/01/1970; if the time is missing, use 00:00:00) [JSON key: waktu]
- seller name (if missing, use -) [JSON key: penjual]
- item name (if missing, use -) [JSON key: barang]
- subtotal (usually the rightmost column in the image; it is the item's unit price multiplied by its quantity) [JSON key: subtotal]
- item quantity (usually the leftmost column; if missing, use 1) [JSON key: jumlah]
- item unit price (if missing, compute it as subtotal / jumlah) [JSON key: harga]
- service charge (if missing, use 0) [JSON key: service]
- tax (if missing, use 0) [JSON key: pajak]
- value-added tax, PPN (if missing, use 0) [JSON key: ppn]
'''

In [None]:
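import os

import pandas as pd
from google import genai
from google.genai import types
from pydantic import BaseModel
from tqdm.notebook import tqdm

client = genai.Client(api_key=API_KEY)

class Invoice(BaseModel):
    """One line item on a receipt; field names match the JSON keys in PROMPT."""
    waktu: str       # transaction date and time, %d/%m/%Y %H:%M:%S
    penjual: str     # seller name
    barang: str      # item name
    harga: float     # unit price
    jumlah: float    # quantity
    service: float   # service charge
    pajak: float     # tax
    ppn: float       # value-added tax (PPN)
    subtotal: float  # unit price x quantity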
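
To make the expected response shape concrete, here is a minimal standard-library sketch of a JSON array with one object per line item, plus a check that every field of the `Invoice` model is present. The sample values are loosely based on the first row of the output table further down, the `waktu` string is written in the format the prompt requests, and `missing_keys` is a hypothetical helper.

```python
import json

# Field names taken from the Invoice model.
REQUIRED_KEYS = {'waktu', 'penjual', 'barang', 'harga', 'jumlah',
                 'service', 'pajak', 'ppn', 'subtotal'}

# Illustrative response text: a JSON array with one object per line item.
sample_response = '''
[
  {"waktu": "28/10/2022 00:00:00", "penjual": "SURYA ALAM",
   "barang": "Paku rivet 3.2", "harga": 350, "jumlah": 200,
   "service": 0, "pajak": 0, "ppn": 0, "subtotal": 70000}
]
'''

def missing_keys(item):
    """Return the required keys absent from one parsed item (hypothetical helper)."""
    return sorted(REQUIRED_KEYS - set(item))

items = json.loads(sample_response)
print([missing_keys(item) for item in items])
```

An empty list for an item means all nine fields are present.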

In [None]:
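def convert_image_to_data(filename, mime_type=None):
    """Send one receipt image to Gemini and return its line items as a DataFrame, or None."""
    if mime_type is None:
        # Derive the MIME type from the file name instead of relying on a global variable.
        _, mime_type = get_file_extension(filename)

    with open(filename, 'rb') as f:
        image_bytes = f.read()

    response = client.models.generate_content(
        model='gemini-2.0-flash-lite',
        config=types.GenerateContentConfig(
            temperature=0.4,
            response_mime_type='application/json',
            response_schema=list[Invoice]
        ),
        contents=[
            types.Part.from_bytes(
                data=image_bytes,
                mime_type=mime_type
            ),
            PROMPT]
    )

    # response.parsed may be None when the reply cannot be parsed into the schema.
    my_data: list[Invoice] = response.parsed or []
    temp_df = pd.DataFrame([item.model_dump() for item in my_data])

    if temp_df.empty:
        return None

    temp_df['waktu'] = pd.to_datetime(temp_df['waktu'], format='%d/%m/%Y %H:%M:%S')
    return temp_df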
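
The `pd.to_datetime` call above raises an error if the model returns a `waktu` string that does not match the requested format. As a hedged sketch using only the standard library, a tolerant parser might look like this; `parse_waktu` and `has_placeholder_date` are hypothetical names, not part of the notebook.

```python
from datetime import datetime

WAKTU_FORMAT = '%d/%m/%Y %H:%M:%S'              # format requested in PROMPT
PLACEHOLDER_DATE = datetime(1970, 1, 1).date()  # date the prompt asks for when none is printed

def parse_waktu(text):
    """Parse a waktu string; return None instead of raising on a mismatch."""
    try:
        return datetime.strptime(text.strip(), WAKTU_FORMAT)
    except ValueError:
        return None

def has_placeholder_date(parsed):
    """True when the parsed value carries the 01/01/1970 placeholder date."""
    return parsed is not None and parsed.date() == PLACEHOLDER_DATE

print(parse_waktu('28/10/2022 00:00:00'))
print(parse_waktu('2022-10-28'))
print(has_placeholder_date(parse_waktu('01/01/1970 00:00:00')))
```

In pandas, passing `errors='coerce'` to `pd.to_datetime` gives a comparable effect by turning unparseable values into `NaT`.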

In [None]:
def get_file_extension(filename):
    """
    Gets the file extension and MIME type of an allowed image file.

    Args:
        filename: The name of the file.

    Returns:
        [extension, mime_type], with the extension in lowercase, if the file is an
        allowed image type; otherwise [None, None].
    """
    mime_types = {
        'png': 'image/png',
        'jpeg': 'image/jpeg',
        'jpg': 'image/jpeg',
        'webp': 'image/webp',
        'heic': 'image/heic',
        'heif': 'image/heif',
    }
    _, file_extension = os.path.splitext(filename)
    file_extension = file_extension[1:].lower()  # Remove the leading dot and convert to lowercase
    if file_extension in mime_types:
        return [file_extension, mime_types[file_extension]]
    return [None, None]
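
The extension check relies on how `os.path.splitext` treats unusual names. The short sketch below, with a hypothetical helper `extension_of`, prints what the same slicing and lowercasing yields for a few illustrative file names, including upper-case extensions, names with several dots, and dot-files.

```python
import os

def extension_of(filename):
    """Lowercase extension without the dot, or '' when there is none (hypothetical helper)."""
    return os.path.splitext(filename)[1][1:].lower()

# Illustrative names only; they are not taken from the notebook's folder.
names = ['nota1.jpg', 'NOTA2.JPEG', 'scan.final.png',
         'nota.xlsx', '.ipynb_checkpoints', 'README']

for name in names:
    print(name, '->', repr(extension_of(name)))
```

Only the last dot counts, and a name that starts with a dot and contains no other dot has no extension at all, so hidden folders are skipped by the filter.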


In [None]:
folder_path = '.'
all_files = os.listdir(folder_path)
image_files = [f for f in all_files if get_file_extension(f)[0] is not None]
frames = []

for filename in tqdm(image_files, desc="Processing images"):
    # image_files is already filtered, so both values are always set here.
    file_extension, mime_type = get_file_extension(filename)
    file_path = os.path.join(folder_path, filename)
    temp_data = convert_image_to_data(file_path)

    if temp_data is not None:
        temp_data['filename'] = filename
        frames.append(temp_data)

# Combine the per-image results; df stays an empty DataFrame when nothing was extracted.
df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
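
An exception raised for one image (for example a failed request or an unparseable date) stops the loop above, so the remaining files are never processed. A hedged sketch of per-file error handling follows. `process_all` and `fake_convert` are hypothetical names, and `fake_convert` merely stands in for `convert_image_to_data` so that the example runs without network access.

```python
def process_all(filenames, convert):
    """Run convert on each file name; collect results and errors separately."""
    results, errors = {}, {}
    for name in filenames:
        try:
            data = convert(name)
        except Exception as exc:  # keep going, but record what went wrong
            errors[name] = str(exc)
            continue
        if data is not None:
            results[name] = data
    return results, errors

def fake_convert(name):
    """Stand-in converter: fails for one file and returns no rows for another."""
    if name == 'broken.jpg':
        raise ValueError('could not parse response')
    if name == 'blank.jpg':
        return None
    return [{'barang': 'example item'}]

results, errors = process_all(['nota1.jpg', 'broken.jpg', 'blank.jpg'], fake_convert)
print(sorted(results))
print(errors)
```

Keeping the errors in a dictionary makes it easy to list, after the run, which receipts need to be uploaded again or entered by hand.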


In [None]:
df

| | waktu | penjual | barang | harga | jumlah | service | pajak | ppn | subtotal | filename |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-10-28 00:00:00 | " SURYA ALAM " | Paku rivet 3.2 | 350.0 | 200.0 | 0.0 | 0.0 | 0.0 | 70000.0 | nota5.jpg |
| 1 | 2022-10-28 00:00:00 | " SURYA ALAM " | plat Alum 1x2m | 780000.0 | 6.0 | 0.0 | 0.0 | 0.0 | 4680000.0 | nota5.jpg |
| 2 | 2022-10-28 00:00:00 | " SURYA ALAM " | Cat tembokyky | 85000.0 | 1.0 | 0.0 | 0.0 | 0.0 | 85000.0 | nota5.jpg |
| 3 | 1970-01-01 00:00:00 | ERMIE ADVERTISING AND PRINTING | Flyer Rumen | 310.0 | 1.0 | 0.0 | 0.0 | 0.0 | 310.0 | nota1.jpg |
| 4 | 1970-01-01 00:00:00 | ERMIE ADVERTISING AND PRINTING | B.Covel matt Canvase | 55.0 | 100.0 | 0.0 | 0.0 | 0.0 | 55.0 | nota1.jpg |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 57 | 1970-01-01 00:00:00 | WANA WISATA & RESTO | kupi tubruh | 6.0 | 4.0 | 0.0 | 0.0 | 0.0 | 24.0 | nota4.jpg |
| 58 | 1970-01-01 00:00:00 | WANA WISATA & RESTO | Kopi T. | 6.0 | 2.0 | 0.0 | 0.0 | 0.0 | 12.0 | nota4.jpg |
| 59 | 1970-01-01 00:00:00 | WANA WISATA & RESTO | kopi Tubruh | 6000.0 | 3.0 | 0.0 | 0.0 | 0.0 | 18000.0 | nota4.jpg |
| 60 | 1970-01-01 00:00:00 | WANA WISATA & RESTO | NOSI goreng | 18000.0 | 3.0 | 0.0 | 0.0 | 0.0 | 54000.0 | nota4.jpg |
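
The prompt defines `subtotal` as unit price times quantity, so a quick consistency check can flag rows where the model may have misread a number. A minimal sketch using plain dictionaries follows. The two sample rows are rows 0 and 4 of the output above, and `inconsistent_rows` is a hypothetical helper with an assumed 1% tolerance.

```python
import math

def inconsistent_rows(rows, rel_tol=0.01):
    """Return rows whose subtotal differs from harga * jumlah by more than rel_tol."""
    return [
        row for row in rows
        if not math.isclose(row['harga'] * row['jumlah'], row['subtotal'], rel_tol=rel_tol)
    ]

rows = [
    {'barang': 'Paku rivet 3.2', 'harga': 350.0, 'jumlah': 200.0, 'subtotal': 70000.0},
    {'barang': 'B.Covel matt Canvase', 'harga': 55.0, 'jumlah': 100.0, 'subtotal': 55.0},
]

for row in inconsistent_rows(rows):
    print(row['barang'])
```

With the notebook's DataFrame, `df.to_dict('records')` would supply the rows in this shape.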


In [None]:
if df.shape[0] > 0:
    df.to_excel('nota.xlsx', index=False)