
Export Google Timeline from a JSON file to CSV/database inserts


thiago-glaser/Process-Google-Timeline

GPS Timeline to Oracle Database

A Python-based tool to extract GPS coordinates from Android Timeline JSON backups and generate Oracle SQL INSERT statements for database loading.

Overview

This project processes Android Timeline location backup data and converts it into Oracle SQL statements for bulk insertion into a GPS_COORDINATES database table. The tool extracts position data from both raw signals and semantic timeline segments, deduplicates records, and generates production-ready SQL scripts with transaction management.

Features

  • Dual Source Extraction: Extracts GPS points from both rawSignals (position data) and semanticSegments (timeline paths)
  • Timezone Conversion: Automatically converts all timestamps to UTC
  • Deduplication: Removes duplicate coordinate entries while preserving chronological order
  • Flexible SQL Generation: Creates individual INSERT statements with per-record commits or batch INSERT statements
  • Oracle Compatible: Generates SQL with proper TO_TIMESTAMP_TZ formatting for Oracle databases
  • Transaction Management: Includes COMMIT statements for data integrity
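The deduplication feature above can be sketched as follows; this is a minimal illustration of removing duplicates while keeping chronological order, not the script's actual implementation:

```python
def dedupe_points(points):
    """Remove duplicate (datetime, lat, lon) tuples, keeping chronological
    order. ISO 8601 datetime strings sort chronologically as plain strings,
    and datetime is the first tuple element, so sorted() orders by time."""
    seen = set()
    unique = []
    for p in sorted(points):
        if p not in seen:
            seen.add(p)
            unique.append(p)
    return unique
```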

Files

Source Scripts

File Purpose
extract_gps.py Extracts GPS coordinates from Timeline.json and generates coordinates.csv
generate_inserts.py Converts coordinates.csv into Oracle SQL INSERT statements

Data Files

File Purpose
coordinates.csv Extracted GPS coordinates in CSV format (generated by running the script)
create_table.sql Oracle table schema definition

Generated SQL Files

File Purpose Use Case
insert_coordinates.sql Individual INSERT statements Small to medium datasets, single-record commits
insert_coordinates_batch.sql Batch INSERT statements Large datasets, better performance

Prerequisites

  • Python 3.7 or later
  • Oracle SQL*Plus client (for executing SQL files)
  • Access to an Oracle database with sufficient privileges
  • Timeline.json from Android Timeline backup

Workflow

Step 1: Extract GPS Coordinates

python extract_gps.py Timeline.json

Input: Timeline.json Output: coordinates.csv

This script:

  • Reads all raw signals (position data with timestamps)
  • Reads all semantic segments (timeline paths)
  • Parses latitude/longitude from coordinate strings
  • Converts all timestamps to UTC (ISO 8601 format)
  • Removes duplicate entries
  • Sorts by datetime
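A stripped-down sketch of the extraction step: the rawSignals and semanticSegments keys come from the Timeline format described above, but the inner field names used here (position, LatLng, timestamp) and the coordinate-string layout are assumptions for illustration only.

```python
import json
import re

def parse_latlng(s):
    """Parse a coordinate string such as '40.7128°, -74.0060°' (format assumed)."""
    lat, lon = re.findall(r'-?\d+(?:\.\d+)?', s)[:2]
    return float(lat), float(lon)

def extract_points(path):
    """Collect (timestamp, lat, lon) tuples from the raw signals in Timeline.json."""
    with open(path) as f:
        data = json.load(f)
    points = []
    for sig in data.get('rawSignals', []):
        pos = sig.get('position', {})
        if 'LatLng' in pos:
            lat, lon = parse_latlng(pos['LatLng'])
            points.append((pos.get('timestamp'), lat, lon))
    return points
```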

Example Output:

datetime,latitude,longitude,altitude
2023-06-15T09:30:00Z,40.7128,-74.0060,12.5
2023-06-15T10:15:00Z,40.7150,-74.0080,15.2
2023-06-15T11:45:00Z,40.7080,-73.9950,8.3
2023-06-15T14:20:00Z,40.7580,-73.9855,22.1
2023-06-15T16:05:00Z,40.7489,-73.9680,18.7

Step 2: Generate SQL Statements

python generate_inserts.py coordinates.csv

Input: coordinates.csv Output:

  • insert_coordinates.sql (individual INSERT statements)
  • insert_coordinates_batch.sql (batch INSERT statements)

This script:

  • Reads the coordinates CSV
  • Generates Oracle-compatible INSERT statements
  • Includes proper timestamp formatting with TO_TIMESTAMP_TZ
  • Wraps statements in BEGIN/END transaction block
  • Adds COMMIT statements for data persistence
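The core of the INSERT generation can be sketched like this; it is a simplified stand-in for generate_inserts.py, using the CSV column names from the header shown in Step 1:

```python
ORACLE_TS_FMT = 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'

def row_to_insert(row):
    """Turn one CSV row (as a dict) into an Oracle INSERT statement."""
    ts = row['datetime'].replace('Z', '+00:00')  # Oracle wants an explicit offset
    alt = row['altitude'] if row.get('altitude') else 'NULL'
    return (
        "INSERT INTO GPS_COORDINATES "
        "(datetime_utc, latitude, longitude, altitude_meters) "
        f"VALUES (TO_TIMESTAMP_TZ('{ts}', '{ORACLE_TS_FMT}'), "
        f"{row['latitude']}, {row['longitude']}, {alt});"
    )
```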

Step 3: Create Oracle Table

sqlplus user/password@database @create_table.sql

Creates the GPS_COORDINATES table with the following structure:

CREATE TABLE GPS_COORDINATES (
    id NUMBER PRIMARY KEY,
    datetime_utc TIMESTAMP WITH TIME ZONE,
    latitude NUMBER(9,7),
    longitude NUMBER(10,7),
    altitude_meters NUMBER
);

Step 4: Load Data into Oracle

Option A: Individual Inserts (Safer for small datasets)

sqlplus user/password@database @insert_coordinates.sql

Characteristics:

  • Each INSERT is committed individually
  • Separate INSERT statements for each record
  • Slower execution
  • Better for partial recoveries

Option B: Batch Inserts (Faster)

sqlplus user/password@database @insert_coordinates_batch.sql

Characteristics:

  • Batch INSERT statements with multiple records per batch
  • Faster execution
  • Better resource utilization
  • Recommended for production loading
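Splitting the rows into fixed-size batches (1000 records per batch in the generated file) reduces to a simple chunking helper, sketched here:

```python
def chunks(rows, size=1000):
    """Yield successive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]
```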

Coordinate Data

Processes GPS coordinates from Android Timeline backups. Data is deduplicated and chronologically sorted.

SQL Format Examples

Individual INSERT (insert_coordinates.sql)

BEGIN
  INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) 
    VALUES (TO_TIMESTAMP_TZ('2023-06-15T09:30:00+00:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'), 
            40.7128, -74.0060, 12.5);
  COMMIT; -- Committed record 1
  INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) 
    VALUES (TO_TIMESTAMP_TZ('2023-06-15T10:15:00+00:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'), 
            40.7150, -74.0080, 15.2);
  COMMIT; -- Committed record 2
  ...
END;
/

Batch INSERT (insert_coordinates_batch.sql)

-- Batch 1 (1000 records)
INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters)
SELECT TO_TIMESTAMP_TZ('2023-06-15T09:30:00+00:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'), 40.7128, -74.0060, 12.5 FROM DUAL
UNION ALL
SELECT TO_TIMESTAMP_TZ('2023-06-15T10:15:00+00:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'), 40.7150, -74.0080, 15.2 FROM DUAL
UNION ALL
SELECT TO_TIMESTAMP_TZ('2023-06-15T11:45:00+00:00', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM'), 40.7080, -73.9950, 8.3 FROM DUAL
...;
COMMIT;

Timestamp Format

All timestamps use the following Oracle format:

TO_TIMESTAMP_TZ('<ISO 8601 value>', 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM')

Example: 2013-02-08T14:38:00+00:00 (ISO 8601 UTC)
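Normalizing a 'Z'-suffixed ISO 8601 string to the explicit-offset form Oracle expects can be done in Python like this (note that on Python 3.7–3.10, datetime.fromisoformat does not accept a trailing 'Z'):

```python
from datetime import datetime, timezone

def to_oracle_ts(ts):
    """Convert e.g. '2013-02-08T14:38:00Z' to '2013-02-08T14:38:00+00:00'."""
    dt = datetime.fromisoformat(ts.replace('Z', '+00:00'))
    return dt.astimezone(timezone.utc).isoformat()
```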

Error Handling

Common Issues

ORA-01821: Date Format Not Recognized

  • Cause: Timestamp format mismatch
  • Solution: Ensure SQL was generated with the latest generate_inserts.py

ORA-00904: Invalid Column Name

  • Cause: Table schema doesn't match INSERT statement
  • Solution: Run create_table.sql first to create the table

File Not Found

  • Cause: Missing Timeline.json or coordinates.csv
  • Solution: Ensure files are in the working directory

Verifying Data Load

After running the SQL script, verify the record count:

SELECT COUNT(*) as total_records FROM GPS_COORDINATES;

Performance Notes

Method Commit Strategy Use Case
Individual Inserts One commit per record Development, small datasets
Batch Inserts One commit per batch Production bulk loading

Times vary based on:

  • Network latency to database
  • Oracle server load
  • Disk I/O performance
  • Commit frequency

Files Generated

ExtractPointsFromMapsBackup/
├── extract_gps.py                       (Extract coordinates script)
├── generate_inserts.py                  (Generate SQL script)
├── create_table.sql                     (Table schema)
├── coordinates.csv                      (GENERATED: extracted GPS coordinates)
├── insert_coordinates.sql               (GENERATED: individual INSERTs)
├── insert_coordinates_batch.sql         (GENERATED: batch INSERTs)
└── README.md                            (This file)

Usage Example - Complete Workflow

# 1. Extract coordinates from Timeline.json
python extract_gps.py Timeline.json
# Output: coordinates.csv

# 2. Generate SQL statements
python generate_inserts.py coordinates.csv
# Output: insert_coordinates.sql, insert_coordinates_batch.sql

# 3. Create table in Oracle
sqlplus user/password@mydb @create_table.sql

# 4. Load data using batch inserts (faster)
sqlplus user/password@mydb @insert_coordinates_batch.sql

# 5. Verify the load
sqlplus user/password@mydb
SQL> SELECT COUNT(*) as total_records FROM GPS_COORDINATES;

Database Compatibility

While this tool is optimized for Oracle databases, it can be adapted to work with other database systems with minimal modifications. The main changes involve adjusting SQL syntax and timestamp formatting in generate_inserts.py.

Supported Databases

PostgreSQL

Modify the timestamp format in generate_inserts.py:

# Current Oracle format
datetime_utc_converted = datetime_utc.replace('Z', '+00:00')
insert_stmt = f"INSERT INTO GPS_COORDINATES ... TO_TIMESTAMP_TZ('{datetime_utc_converted}', ...)"

# PostgreSQL format
insert_stmt = (
    f"INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) "
    f"VALUES ('{datetime_utc}'::timestamptz, {latitude}, {longitude}, {altitude_clause});"
)

Update create_table.sql:

CREATE TABLE GPS_COORDINATES (
    id SERIAL PRIMARY KEY,
    datetime_utc TIMESTAMP WITH TIME ZONE,
    latitude DECIMAL(9,7),
    longitude DECIMAL(10,7),
    altitude_meters DECIMAL(10,2)
);

MySQL/MariaDB

Modify the timestamp format in generate_inserts.py:

# MySQL format
datetime_utc_formatted = datetime_utc.replace('Z', '')
insert_stmt = (
    f"INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) "
    f"VALUES ('{datetime_utc_formatted}', {latitude}, {longitude}, {altitude_clause});"
)

Update create_table.sql:

CREATE TABLE GPS_COORDINATES (
    id INT AUTO_INCREMENT PRIMARY KEY,
    datetime_utc TIMESTAMP(6),
    latitude DECIMAL(9,7),
    longitude DECIMAL(10,7),
    altitude_meters DECIMAL(10,2),
    INDEX idx_datetime (datetime_utc)
);

SQL Server

Modify the timestamp format in generate_inserts.py:

# SQL Server format
datetime_utc_formatted = datetime_utc.replace('Z', '')
insert_stmt = (
    f"INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) "
    f"VALUES (CAST('{datetime_utc_formatted}' AS DATETIME2), {latitude}, {longitude}, {altitude_clause});"
)

Update create_table.sql:

CREATE TABLE GPS_COORDINATES (
    id INT IDENTITY(1,1) PRIMARY KEY,
    datetime_utc DATETIME2(6),
    latitude DECIMAL(9,7),
    longitude DECIMAL(10,7),
    altitude_meters DECIMAL(10,2)
);

SQLite

Modify the timestamp format in generate_inserts.py:

# SQLite format (uses ISO 8601 strings)
insert_stmt = (
    f"INSERT INTO GPS_COORDINATES (datetime_utc, latitude, longitude, altitude_meters) "
    f"VALUES ('{datetime_utc}', {latitude}, {longitude}, {altitude_clause});"
)

Update create_table.sql:

CREATE TABLE GPS_COORDINATES (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    datetime_utc TEXT,
    latitude REAL,
    longitude REAL,
    altitude_meters REAL
);

Key Customization Points

  1. Timestamp Format: Different databases have different timestamp representations
  2. Data Types: Adjust DECIMAL, NUMBER, REAL, etc. based on database requirements
  3. Primary Key: Some databases use SERIAL/AUTO_INCREMENT instead of NUMBER
  4. Transaction Syntax: BEGIN/END blocks may differ (e.g., MySQL uses START TRANSACTION)
  5. Escape Characters: Quote styles vary across databases
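One way to keep those per-database differences in a single place is a small dispatch table for the timestamp expression. The dialect names and helper below are hypothetical, not part of the current scripts; the Oracle, PostgreSQL, and MySQL forms mirror the snippets shown above:

```python
# Map each dialect to a function that renders an ISO 8601 string
# as that database's timestamp literal.
TS_EXPR = {
    'oracle':   lambda ts: f"TO_TIMESTAMP_TZ('{ts}', 'YYYY-MM-DD\"T\"HH24:MI:SSTZH:TZM')",
    'postgres': lambda ts: f"'{ts}'::timestamptz",
    'mysql':    lambda ts: f"'{ts.replace('Z', '')}'",
    'sqlite':   lambda ts: f"'{ts}'",
}

def ts_literal(dialect, ts):
    return TS_EXPR[dialect](ts)
```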

Steps to Adapt for a New Database

  1. Create a backup of generate_inserts.py
  2. Modify the INSERT statement generation in the generate_inserts() function
  3. Update create_table.sql with proper data types and constraints
  4. Test with a small subset of data first
  5. Adjust commit frequency and batch size based on database performance

Customization

Modifying Commit Frequency

To change commit frequency in insert_coordinates.sql, edit generate_inserts.py:

# Current: commits after every record (idx % 1 is always 0)
if idx % 1 == 0:  # change 1 to 100 to commit every 100 records
    f.write("  COMMIT;\n")

Changing Batch Size

To change batch size in insert_coordinates_batch.sql, edit generate_inserts.py:

generate_batch_inserts(csv_path, output_batch, batch_size=1000)  # Change 1000 to desired batch size

Filtering by Date Range

Edit extract_gps.py to filter coordinates by date:

# Add date filtering after extracting coordinates.
# Note: on Python 3.7-3.10, datetime.fromisoformat() does not accept a
# trailing 'Z', so use an explicit +00:00 offset.
from datetime import datetime

start_date = datetime.fromisoformat('2020-01-01T00:00:00+00:00')
end_date = datetime.fromisoformat('2021-12-31T23:59:59+00:00')

coordinates = [c for c in coordinates
               if start_date <= datetime.fromisoformat(c['datetime'].replace('Z', '+00:00')) <= end_date]

Troubleshooting

No coordinates extracted

  • Check: Timeline.json file is valid JSON
  • Check: File contains rawSignals and/or semanticSegments arrays
  • Command: python -c "import json; json.load(open('Timeline.json'))"

Fewer records than expected

  • Check: Duplicates are being removed (expected behavior)
  • Check: Invalid coordinates are being filtered out
  • Verify: Run python extract_gps.py Timeline.json to see progress

SQL execution too slow

  • Use: Batch INSERT instead of individual INSERTs
  • Check: Network latency to database server
  • Consider: Disabling indexes during bulk load, rebuilding after

Timestamp conversion errors

  • Verify: All timestamps in coordinates.csv are in ISO 8601 format
  • Check: Timezone offset is included (+00:00 for UTC)
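Malformed timestamps can be found before loading with a quick checker like this sketch; the datetime column name matches the CSV header shown in Step 1:

```python
import csv
import re

# ISO 8601 with 'Z' or an explicit +HH:MM / -HH:MM offset
ISO_RE = re.compile(r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})$')

def bad_timestamps(csv_path):
    """Return (line_number, value) for every row whose datetime is not ISO 8601."""
    bad = []
    with open(csv_path, newline='') as f:
        for lineno, row in enumerate(csv.DictReader(f), start=2):
            if not ISO_RE.match(row['datetime']):
                bad.append((lineno, row['datetime']))
    return bad
```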

Requirements

  • Python 3.7+
  • Oracle 11g or later
  • SQL*Plus (Oracle client)

License

This project is provided as-is for GPS timeline data management.

Support

For issues with:

  • Python scripts: Check the script output for detailed error messages
  • SQL syntax: Verify Oracle version compatibility
  • Timeline.json: Ensure it's exported from Android Timeline backup

Last Updated: January 31, 2026
