Skip to content
This repository has been archived by the owner on Aug 15, 2023. It is now read-only.
/ toMixpanel Public archive

🚰 ETL data into Mixpanel from many sources

Notifications You must be signed in to change notification settings

ak--47/toMixpanel

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

94 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

to Mixpanel

update: 8/15/23

tldr; this package is deprecated. use the new stuff.

originally the design of this module was an "all-in-one" ETL to bring data out of certain vendor tools into mixpanel to allow people to self-serve on data migrations and historical backfills.

but, product analytics data is BIG, and doing any kind of significant data volume on your personal computer (over a consumer grade internet connection), with no way to "resume" a job, is unreliable.

mixpanel has evolved dramatically over the last few years and with the release of simple identity merge and our migration packages, much of the functionality in this this module is no longer relevant.

i have opted not to delete this repository, as the code works for certain cases (small data sets, mixpanel's original identity merge with $merge events, and incremental user props), however this module will receive no further updates.

if you are looking to migrate your data from a vendor specific format to mixpanel, these are the packages you want:

Self-Serve Data Migration Tools

Amplitude

Heap

Adobe

Generic

if you need help moving your historical data to mixpanel contact us ... we will help! 💪


wat.

toMixpanel is an ETL script in Node.js that provides one-time data migrations from common product analytics tools... to mixpanel.

It implements Mixpanel's /import , $merge and /engage endpoints

It uses service accounts for authentication, and can batch import millions of events and user profiles quickly.

This script is meant to be run locally and requires a JSON file for configuration.

tldr;

git clone https://github.com/ak--47/toMixpanel.git

cd toMixpanel/

npm install

node index.js ./path-To-JSON-config

alternatively:

npx to-mixpanel ./path-To-JSON-config

Detailed Instructions

Install Dependencies

This script uses npm to manage dependencies, similar to a web application.

After cloning the repo, cd into the /toMixpanel and run:

npm install

this only needs to be done once.

Config File

toMixpanel requires credentials for your source and your destination

Here's an example of a configuration file for amplitude => mixpanel:

{
  "source": {
    "name": "amplitude",
    "params": {
      "api_key": "{{amplitude api key}}",
      "api_secret": "{{ amplitude api secret }}",
      "start_date": "2021-09-17",
      "end_date": "2021-09-17"
    },
    "options": {
      "save_local_copy": true,
      "is EU?": false
    }
  },
  "destination": {
    "name": "mixpanel",
    "project_id": "{{ project id }}",
    "token": "{{ project token }}",
    "service_account_user": "{{ mp service account }}",
    "service_account_pass": "{{ mp service secret }}",
    "options": {
      "is EU?": false,
	  "recordsPerBatch": 2000
    }
  }
}

you can find more configuration examples in the repo.

supported sources

required params: api_key, api_secret, start_date, end_date, is EU?

that's right! you can use toMixpanel to migrate one mixpanel project to another!

required params: token, secret, start_date, end_date, is EU?, do_events, do_people

options: where (see docs), event (see docs), recordsPerBatch (in destination)

required params: filePath, event_name_col, distinct_id_col, time_col, insert_id_col (note: filePath can be EITHER a path to a CSV file or a folder which contains multiple CSV files)

required params: project_id, bucket_name, private_key_id, private_key, client_email, client_id, auth_uri, token_uri, auth_provider_x590_cert_url, client_x509_cert_url options: path_to_data (for large datasets, does line-by-line iteration)

*note: google analytics does not have public /export APIs, so you'll need to export your data to bigQuery first, and then export your bigQuery tables to google cloud storage as JSON. You can then create a service account in google cloud storage which can access the bucket; the above-mentioned values are given to you when you create a service account

About

🚰 ETL data into Mixpanel from many sources

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published