
Offline Conversion Automated Uploader

Offline Conversion Automated Uploader is a command-line tool that helps Facebook advertisers and marketing partners upload offline transactions to the Facebook Marketing API without having to build their own API integration.

Why use this tool?

  • Building an API integration requires engineering resources. Typically, an engineer without prior experience with the Facebook Marketing API needs about three weeks for development and testing.
  • To achieve the best possible match between your customers and Facebook users, the data needs to be normalized and hashed correctly. This tool uses libraries written by Facebook to ensure the best possible match rate.
  • For any issues with this tool, you will get support from Facebook.
  • This tool is updated periodically to support more features.

Prerequisites

  • An offline event set. How to create
  • Your offline event file in CSV format. Examples
  • Strongly recommended: manually upload your offline events via the UI to familiarize yourself with this product.
  • A system user access token. Follow steps 2 and 3 of this guide. Not required if you only use this tool to hash the data.

How to use

Install

Make sure git and node are installed and up-to-date on your machine, then run:

git clone https://github.com/facebookincubator/offline-conversion-file-uploader
cd offline-conversion-file-uploader
npm install
npm run compile

Examples

Before uploading real data, you can create a test data set and upload some test data to familiarize yourself with this tool. Take a look at this guide and try it out.

Run Command

Use the validate command while you are setting up the tool and its config. Check the report generated by this command and resolve any issues. Once you are done with the configuration, use the upload command to upload events.

If you want to separate hashing and uploading into two steps, you can use the preprocess command first to hash the file, and then use the upload-preprocessed command to upload the hashed file.

node lib/cli.js <COMMAND> --configFilePath <PATH_TO_CONFIG> <...OTHER_PARAMS>

See the Commands section for details of each command. See the previous section for some sample commands.
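
For example, assuming your config JSON is saved as upload-config.json (a placeholder name), a typical first run might look like:

node lib/cli.js validate --configFilePath upload-config.json --reportOutputPath validate-report.txt
node lib/cli.js upload --configFilePath upload-config.json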

Schedule

Once you have verified that an upload completes successfully, you can automate your uploads by scheduling the command. We recommend using crontab on POSIX systems, and PowerShell with Task Scheduler on Windows.
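
For example, a crontab entry for a nightly upload might look like the following (the repository path and config file name are placeholders):

# Run the upload every day at 02:00 and append output to a log file.
0 2 * * * cd /path/to/offline-conversion-file-uploader && node lib/cli.js upload --configFilePath upload-config.json >> upload.log 2>&1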

Commands

| Command | Description | Options (required options are in **bold**) |
| --- | --- | --- |
| upload | Hash and upload the CSV file to Facebook. | **accessToken**, **configFilePath**, **dataSetID**, **inputFilePath**, **mapping**, batchSize, customTypeInfo, delimiter, format, header, ignoreSampleErrors, logging, namespaceID, presetValues, reportOutputPath, skipRowsAlreadyUploaded, uploadID, uploadTag, uploadTagPrefix |
| validate | Do a dry run and validate some sample rows. | **accessToken**, **configFilePath**, **dataSetID**, **inputFilePath**, **mapping**, customTypeInfo, delimiter, format, header, logging, namespaceID, numRowsToValidate, presetValues, reportOutputPath |
| preprocess | Hash the CSV file and store it locally without sending it to Facebook. Also normalizes other fields, such as converting ISO-formatted event time into Unix time. | **inputFilePath**, **configFilePath**, **mapping**, customTypeInfo, delimiter, format, header, ignoreSampleErrors, logging, preprocessOutputPath, presetValues, reportOutputPath |
| upload-preprocessed | Upload the hashed CSV generated by the preprocess command. | **accessToken**, **dataSetID**, **inputFilePath**, batchSize, configFilePath, ignoreSampleErrors, logging, namespaceID, reportOutputPath, skipRowsAlreadyUploaded, uploadID, uploadTag, uploadTagPrefix |
| upload-audience | Add or remove users to/from your custom audience. | **accessToken**, **configFilePath**, **customFileSource**, **inputFilePath**, **mapping**, adAccountID, appIDs, batchSize, customAudienceID, delimiter, format, header, ignoreSampleErrors, logging, pageIDs, removeUsers, reportOutputPath |
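
For example, the two-step preprocess flow described earlier might be invoked like this (file names are placeholders):

node lib/cli.js preprocess --configFilePath upload-config.json --preprocessOutputPath hashed.csv
node lib/cli.js upload-preprocessed --configFilePath upload-config.json --inputFilePath hashed.csv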

Options

How to Specify

Most options can be specified in either of two ways: directly on the CLI, or in a config JSON file.

For example, if you want to specify accessToken, you can either pass it directly to the CLI command

node lib/cli.js <COMMAND> --accessToken <YOUR_ACCESS_TOKEN> ...

or set it in the config JSON specified by configFilePath.

{
    "accessToken": "<YOUR_ACCESS_TOKEN>",
    ...
}

For boolean options like skipRowsAlreadyUploaded, there is no need to explicitly set the value to true when specifying them on the CLI:

node lib/cli.js <COMMAND> --skipRowsAlreadyUploaded ...
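
In the config JSON, set the value explicitly:

{
    "skipRowsAlreadyUploaded": true
}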

All Options

Here is a list of all the options we support. See the previous section for the options supported or required by each command. See Resuming Support for why uploadTag/uploadTagPrefix/uploadID are deprecated.

| Option | Description | Where to specify | Default | Example |
| --- | --- | --- | --- | --- |
| accessToken | Access token required to make API calls to the Facebook server. | CLI or Config JSON | No default | "EAAC...T4ZD" |
| dataSetID | ID of the offline event set you want to upload events to. | CLI or Config JSON | No default | "123456789" |
| uploadTag | (Deprecated) Tag to identify the events uploaded. Use a unique string for each distinct file uploaded. | CLI or Config JSON | No default | "Offline Conversions" |
| uploadTagPrefix | (Deprecated) Instead of providing uploadTag, you can define a prefix (e.g. Offline Conversions); the tool appends the filename and timestamp and uses the result as the uploadTag, e.g. Offline Conversions (example_events_big_100k.csv@1493837377000). If uploadTag is set, uploadTagPrefix is ignored. | CLI or Config JSON | No default | "Offline Conversions" |
| uploadID | (Deprecated) ID of the uploadTag. | CLI or Config JSON | No default | "123456789" |
| skipRowsAlreadyUploaded | Skip rows if part of the same file was already uploaded before. | CLI or Config JSON | false | true |
| ignoreSampleErrors | When this option is not specified, the command stops if there are too many errors in the first 1,000 rows. Use this option to override that behavior and forcefully continue execution. | CLI or Config JSON | false | true |
| inputFilePath | Path of the input CSV. | CLI or Config JSON | No default | "path/to/offline_data.csv" |
| batchSize | How many events are sent to the Facebook server in one API call. Ranges from 1 to 2000. Lower the number if your network is slow or unstable. | CLI or Config JSON | 2000 | 500 |
| delimiter | Delimiter of the CSV file. | CLI or Config JSON | "," | "\t" |
| header | Whether to treat the first row as a header and skip it when uploading. | CLI or Config JSON | false | true |
| mapping | Defines what each column in the file is for. | Config JSON Only | No default | See the mapping section below. |
| format | Provides more information about the formatting of certain types of columns. | Config JSON Only | {} | See the mapping section below. |
| customTypeInfo | Provides more information for each custom data column in the mapping. | Config JSON Only | {} | See the mapping section below. |
| presetValues | Provides default values for missing mappings. | Config JSON Only | {} | See the mapping section below. |
| logging | Level of logging. See supported levels here. | CLI or Config JSON | "verbose" | "debug" |
| namespaceID | Namespace ID for third-party ID mapping. | CLI or Config JSON | No default | "123456789" |
| numRowsToValidate | Maximum number of sample rows to validate when running the validate command. Ranges from 1 to 1000. | CLI or Config JSON | 1000 | 500 |
| preprocessOutputPath | Output path of the preprocess command. Hashed events are written to this path. The file is truncated if it already exists. | CLI or Config JSON | "preprocess-output.csv" | "hashed.csv" |
| reportOutputPath | Output path of the report of each command. The report contains a summary, issues, and error samples. The file is truncated if it already exists. | CLI or Config JSON | "report.txt" | "upload-report.txt" |
| adAccountID | ID of your ad account if you are creating a new audience with this upload. | CLI or Config JSON | No default | "123456789" |
| customAudienceID | ID of your custom audience if you are uploading to an existing audience. | CLI or Config JSON | No default | "123456789" |
| customFileSource | Describes how the customer information in your custom audience was originally collected. Allowed values: USER_PROVIDED_ONLY, PARTNER_PROVIDED_ONLY, BOTH_USER_AND_PARTNER_PROVIDED. | CLI or Config JSON | No default | "USER_PROVIDED_ONLY" |
| retentionDays | Number of days to keep a user in this cluster, between 0 and 180. Defaults to forever when not specified or set to 0. | CLI or Config JSON | 0 | 90 |
| removeUsers | Remove the uploaded users from the custom audience. | CLI or Config JSON | false | true |
| appIDs | A list of app IDs, used when appuid is mapped. | Config JSON Only | No default | ["123456789", "987654321"] |
| pageIDs | A list of page IDs, used when pageuid is mapped. | Config JSON Only | No default | ["123456789", "987654321"] |

Mapping for Offline Events

The mapping field is a dictionary that maps the index of each column to the offline event schema. Here is an example:

"mapping": {
  "0": "match_keys.email",
  "1": "match_keys.phone",
  "2": "event_time",
  "3": "event_name",
  "4": "value",
  "5": "currency",
  "7": "custom_data.sales_type",
  "9": "custom_data.department"
}

The key in mapping is the column index, starting at 0. The value can be one of the entries listed in the following table:

| Mapping | Required? | Description |
| --- | --- | --- |
| match_keys.XXXXX | Required | The identifier info used to match people. Replace XXXXX with a match key type such as email or phone. For the list of available match key types, see the 'Key name' column in this table. |
| event_time | Required | Time when the offline conversion happened. The data in the corresponding column should preferably be formatted in ISO8601 (example: 2018-03-09T19:17:19.345Z) or Unix time (example: 1520623023). |
| event_name | Required | Can be specified either in mapping or in presetValues. See the event_name row in the data parameters table. |
| value | Required for Purchase events; optional otherwise | Value of the conversion event. We do not accept 0 or negative values. Example: 16.00. |
| currency | Required if value is mapped | Three-letter ISO currency code for this conversion event. Can be specified either in mapping or in presetValues. Required when, and only when, value is present. |
| order_id | Optional but recommended; required if item_number is mapped | The unique ID associated with each of your transactions. |
| item_number | Optional | Distinguishes different items within the same order. |
| custom_data.XXXXX | Optional | Additional information about the conversion event. For example, send a store location ID as custom_data.location_id. |

We also provide the presetValues option for your convenience if your CSV file does not contain a column for event_name or currency. For example, if all of your events are purchases and use USD as the currency, you can use the following presetValues:

"presetValues": {
    "event_name": "Purchase",
    "currency": "USD"
}

Note that you cannot specify a field's value in both presetValues and mapping. In the example above, you are not allowed to map any column to event_name or currency in your mapping, since they are already specified in presetValues.

For some specific mappings, namely match_keys.dob and event_time, you also need to specify the format of your data in the format option.

For match_keys.dob we support the following formats:

  • MM/DD/YYYY MMDDYYYY MM-DD-YYYY
  • DD/MM/YYYY DDMMYYYY DD-MM-YYYY
  • YYYY/MM/DD YYYYMMDD YYYY-MM-DD
  • MM/DD/YY MMDDYY MM-DD-YY
  • DD/MM/YY DDMMYY DD-MM-YY
  • YY/MM/DD YYMMDD YY-MM-DD

For event_time we support ISO8601 and unix_time. The formats listed above are also supported for event_time, but they are not recommended, since event_time should be accurate to minutes or seconds.

For event_time, we also provide an option to offset the time zone. For ISO8601 and unix_time no offset is needed, so set it to 0. For the other formats, you can set it to a number that represents the offset (in hours) that should be applied to the event_time.

Here is an example of format; note that the format of event_time and match_keys.dob differs slightly:

"format": {
  "event_time": {
    "timeFormat": "ISO8601"
  },
  "dob": "MM/DD/YYYY"
}

The last piece of the mapping is customTypeInfo. If you don't map any custom data fields, leave it unspecified or as an empty dictionary {}. If you do map custom data fields, you need to specify each field's type as either string or number. For example, if you've mapped custom_data.margin_value and custom_data.department_name, set customTypeInfo to:

"customTypeInfo": {
  "margin_value": {
    "baseType": "number"
  },
  "department_name": {
    "baseType": "string"
  }
}

Mapping for Custom Audience

The mapping for custom audience is slightly different from offline events.

| Mapping | Required? | Description |
| --- | --- | --- |
| match_keys.XXXXX | Required | Same as for offline events. |
| lookalike_value | Optional | The lifetime value of a user, used for value-based lookalike audiences. We do not accept 0 or negative values. |

Resuming Support

Resuming is a way to ensure each row is uploaded once and only once, regardless of process or machine crashes or network failures. Resuming is supported only for offline events uploads. For custom audience uploads, you can safely upload duplicate rows for the same user.

The following prerequisites should be satisfied to support resuming:

  • Use a different file name for each input file. One way to guarantee this is to add a date stamp to each file name. For example, use Offline Conversions 2018-03-01.csv for sales that happened that day (or that week).
  • Once an input file is generated, do not change it.
  • The anti-pattern is to use one file name for all events and to update that file with new events. Don't do this.

Then when you call the script to upload the events, make sure:

  • Do not specify uploadTag/uploadTagPrefix/uploadID. The tool automatically generates an uploadTag based on the name and size of the input file, and uses the auto-generated uploadTag to discover the ranges of events that were already uploaded.
  • Set skipRowsAlreadyUploaded. This option allows the uploader to skip rows that fall into ranges uploaded before.
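
Putting this together, a resumable daily upload might use a config like the following (the token, data set ID, and column mapping are placeholders; adjust them to match your file):

{
    "accessToken": "<YOUR_ACCESS_TOKEN>",
    "dataSetID": "123456789",
    "inputFilePath": "Offline Conversions 2018-03-01.csv",
    "skipRowsAlreadyUploaded": true,
    "header": true,
    "mapping": {
        "0": "match_keys.email",
        "1": "event_time",
        "2": "event_name",
        "3": "value",
        "4": "currency"
    },
    "format": {
        "event_time": {
            "timeFormat": "ISO8601"
        }
    }
}

and be invoked without any uploadTag/uploadTagPrefix/uploadID flags:

node lib/cli.js upload --configFilePath upload-config.json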

FAQ:

Q: There's another tool called MDFU, which one should I use?

We previously built a tool named MDFU with similar functionality. Use this tool whenever possible; MDFU will be deprecated soon. This tool provides the following additional functionality:

  • It generates a report to help you troubleshoot issues with your data file.
  • It is more robust, because it uses the battle-proven uploader core that is used in the web version.
  • It supports separating preprocessing and uploading into two steps, so you can verify that data is hashed properly before sending it to Facebook.
  • It supports a validate command, which does a dry run on some sample rows of your data file, so you have a chance to fix issues before sending events to Facebook.
  • It supports resuming, so you can upload the same file again without causing duplicates.

Q: My company has a firewall which will block API calls to Facebook. What are my options?

  • Whitelist Facebook IPs: Contact your security team to whitelist the IP addresses returned by this command:

    whois -h whois.radb.net -- '-i origin AS32934' | grep ^route
    

    For more information, please refer to this guide, which explains whitelisting for Facebook crawlers; the same set of IPs is used for the API servers.

  • Request that your security team create a DMZ where outbound HTTPS requests are allowed.

Q: How does this tool work?

This is a node.js application that goes through the following steps to upload your offline conversions to Facebook's Marketing API:

  1. Read configurations.
  2. Read the input file as a stream.
  3. For each line read, normalize and hash the columns for upload.
  4. Collect the normalized and hashed data into batches.
  5. POST each batch to the API endpoint.
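
As an illustration of step 3, match keys such as email are normalized (for example, trimmed and lowercased) and then SHA-256 hashed before upload. The following is only a sketch of the idea, not the tool's actual code; the sample value is hypothetical:

// A minimal sketch of preparing an email match key, runnable with node.
const crypto = require('crypto');

// Hypothetical raw value from a CSV column mapped to match_keys.email.
const raw = '  Jane.Doe@Example.COM ';

// Normalize (trim + lowercase), then hash with SHA-256.
const normalized = raw.trim().toLowerCase();
const hashed = crypto.createHash('sha256').update(normalized).digest('hex');
console.log(hashed);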

License

Facebook Offline Conversion Automated Uploader is BSD-licensed. We also provide an additional patent grant.
