Loading data by using JSON files
Developers load their supply chain data so that the IBM Intelligence Suite can get up and running and display the relevant data on each dashboard. As an alternative to using CSV files, you can load your data by using JSON files.
Note: The recommended approach is to load data by using CSV files, so that your data is easy to build, consume, and initially load. CSV files are recommended because they can be used to efficiently set up demo data, daily data integration, and automated data loads for production data.
- Review the Data model. If you are loading data for the first time, the initial data that you load typically consists of the records that have a high data ingest priority in the data model, where a value of 1 indicates must-haves to get started with the Intelligence Suite, and values of 2 or 3 indicate nice-to-haves such as details on widgets and other pages.
- Put together your ingest data and then complete this task. As a developer, you will send data in batches from external data stores such as IBM Cloud Object Store (COS). In this task, you will learn how to write a rule to programmatically import events from a Cloud Object Store bucket into your tenant as BusinessObjectEvents, and then query those events.
- Using an HTTP tool such as Postman, set the URL to https://SCIS_endpoint/api/v1/data/import/rules
- Set the method to POST.
- Set the Content-Type header to application/json.
- Set the request's Body to the following content:
{
  "name": "My first COS bulk import rule",
  "tenantId": "{{tenant_ID}}",
  "instructions": {
    "source": {
      "sourceType": "COS",
      "COS": {
        "bucketName": "{{bucket_name}}",
        "prefixName": "{{prefix_name}}",
        "endpoint": "{{COS_private_endpoint}}",
        "instanceId": "{{COS_instance_ID}}",
        "bucketRegion": "{{COS_bucket_location}}",
        "secretName": "{{secret_name}}"
      }
    },
    "routeTo": {
      "destinationType": "streamingIngest",
      "streamingIngest": {
        "priority": "1"
      }
    }
  }
}
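The rule-creation call above can be sketched in Python by using only the standard library. The endpoint URL and all of the placeholder values are assumptions that you must replace with your own, and any authentication headers that your tenant requires are omitted.

```python
import json
import urllib.request

# Hypothetical values -- substitute your own SCIS endpoint and COS details.
SCIS_ENDPOINT = "https://SCIS_endpoint"

rule = {
    "name": "My first COS bulk import rule",
    "tenantId": "tenant_ID",
    "instructions": {
        "source": {
            "sourceType": "COS",
            "COS": {
                "bucketName": "bucket_name",
                "prefixName": "prefix_name",
                "endpoint": "COS_private_endpoint",
                "instanceId": "COS_instance_ID",
                "bucketRegion": "COS_bucket_location",
                "secretName": "secret_name",
            },
        },
        "routeTo": {
            "destinationType": "streamingIngest",
            "streamingIngest": {"priority": "1"},
        },
    },
}

# Build the POST request described in the steps above.
request = urllib.request.Request(
    f"{SCIS_ENDPOINT}/api/v1/data/import/rules",
    data=json.dumps(rule).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(request)  # uncomment with real credentials
```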
Note: To obtain the endpoint and region values, click Configuration for your bucket definition in COS and make note of the private endpoint and location.
Note: If you receive the following error, ensure that you are using a private endpoint in your import rule.
"status": {
"code": "loading_error",
"details": "Version 1.0 of this rule was not able to be loaded due to the following error: Failed to get list of unprocessed objects due to Client execution did not complete before the specified timeout configuration."
}
- Using an HTTP tool such as Postman, set the URL to https://SCIS_endpoint/api/v1/data/import/rules/rule_id/status
- Set the method to GET.
- The new rule's status might be returned in the following format:
{
  "code": "loading",
  "details": "Version 1.0 of this rule is in the process of being loaded. It will not start processing until it is finished loading"
}
- Wait approximately 30 seconds and issue the GET call again, repeating until the new rule's status is returned in the following format:
{
  "code": "live",
  "details": "Version 1.0 of this rule is live and processing"
}
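The polling step above can be sketched as a small loop. The status-fetching call is left as a stub here (any callable that returns the status dict from the GET call will do), and the 30-second interval follows the text.

```python
import time

def wait_until_live(get_status, max_attempts=10, interval_seconds=30):
    """Poll the rule's status until its code is 'live'.

    get_status is any callable returning the status dict from
    GET /api/v1/data/import/rules/rule_id/status.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status.get("code") == "live":
            return status
        # Still loading; wait before checking again.
        time.sleep(interval_seconds)
    raise TimeoutError("rule did not reach 'live' status")
```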
Next, prepare the file of events to upload. Ensure that the file meets the following criteria:
- The file must contain individual events in JSON format, with each event on a new line.
- The file name cannot contain any of these characters: /":'?<>&*|.
- The file can be in a compressed format (only gzip is supported); in that case, the file name must end with .gz or .gzip.
- All the events in each uploaded file are assigned a unique batchId, which can be used to query the events or objects through the batchTraceId attribute in a simple filter.
- Use the following sample events as a reference: place them in a file and edit the tenantIds appropriately, or remove the tenantId from each event and specify it as a tag instead. Do not format or prettify the JSON body. If you specified a prefixName in your rule, ensure that you upload the file with that prefix.
{"eventCode":"objectUpsertEvent", "tenantId":"087c8b15-c0df-48e1-ae0b-b991aa8d1bf3", "timestampEventOccurred":"2021-03-22T03:49:11.024Z", "eventDetails":{ "businessObject":{ "type":"Product", "globalIdentifiers":[ { "name":"import_test_product_id", "value":"testProductIdABC123" } ], "name": "testProductABC1231", "partNumber": "testPartNumberABC1231", "value": 100000, "valueCurrency": "USD", "category": { "globalIdentifiers":[ { "name":"import_test_category_id", "value":"testCategoryIDABC123" } ] } } }}
{"eventCode":"objectUpsertEvent", "tenantId":"087c8b15-c0df-48e1-ae0b-b991aa8d1bf3", "timestampEventOccurred":"2021-03-22T03:49:11.024Z", "eventDetails":{ "businessObject":{ "type":"Catalog", "globalIdentifiers":[ { "name":"import_test_category_id", "value":"testCategoryIdABC123" } ], "name": "testCategoryABC123", "code": "testCodeABC123", "value": 100000 } }}
Optionally, instead of specifying a tenantId in each event within the file, you can add your tenant ID to a tag called tenantId when you upload the file. If an event does not specify a tenantId, the value from the tag is added to the event. Any event that does specify a tenantId must match the tag's value.
The following shows the minimum required fields for an event:
{
  "eventCode": "objectUpsertEvent",
  "tenantId": "xxxxxx-xxxx-xxxx-xxxxxxxxxx",
  "eventDetails": {
    "businessObject": {
      "type": "Xxxxxxxxx", //valid Object type; eg DemandPlan
      "globalIdentifiers": [
        {
          "name": "xxxxxxx",
          "value": "yyyyyy"
        }
      ]
    }
  }
}
- Log in to your IBM Cloud account.
- Open the Resource list.
- Expand the Storage section and select your COS instance.
- Select Buckets in the left menu and find your bucket.
- Click Upload.
- Upload the file to your bucket. If you specified a prefixName in your rule, enter it in the Prefix for objects (optional) text box in the COS upload screen, and ensure that the prefix ends with a /. Do not add the prefix to the name of the file itself, or COS will escape the slash and your import will not proceed as expected.
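The file-name and prefix rules above can be checked programmatically before uploading. This small sketch validates the name against the forbidden-character list from the criteria earlier on this page and builds the resulting object key; the helper name is illustrative.

```python
# Characters that a file name must not contain, per the criteria above.
FORBIDDEN = set('/":\'?<>&*|')

def build_object_key(prefix, file_name):
    """Validate the file name and prepend the rule's prefix (if any)."""
    bad = [char for char in file_name if char in FORBIDDEN]
    if bad:
        raise ValueError(f"file name contains forbidden characters: {bad}")
    if not prefix:
        return file_name
    # The prefix must end with exactly one slash; the file name itself
    # must not repeat the prefix.
    return prefix.rstrip("/") + "/" + file_name
```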
A status file is written to your COS bucket for every batch of records uploaded, in the same bucket that is used to get the data to import, at the following location:
importStatus/streamingIngest/PrefixName/UploadedObjectFileName_fileUploadTimestamp_status.json. The file can be downloaded from COS.
Note: It might take up to 2 minutes for the import service to see your new file on COS. A sample payload of the status resembles the following format:
{
  "numberOfEventsImported": 3,
  "totalNumberOfRecords": 4,
  "bucketName": "cos-bulk-import-tutorial",
  "importStartTimestamp": "2021-03-20T08:10:28.731Z",
  "importCompletedTimestamp": "2021-03-20T08:10:30.732Z",
  "consumedBytes": 6388,
  "numberOfEventsRejected": 1,
  "tenantId": "c60fe724-6224-4078-bcb3-a166118fa8a1",
  "objectName": "importdemo/sampleImportTest1.json",
  "batchId": "<batchId>",
  "status": "COMPLETED",
  "totalSizeBytes": 6388
}
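Once the status file appears, a quick check like the following can confirm that the batch completed and flag rejected events. The field names match the sample payload above; the helper itself is illustrative.

```python
import json

def summarize_import_status(status_json):
    """Return (ok, message) for a streamingIngest status payload."""
    status = json.loads(status_json)
    imported = status.get("numberOfEventsImported", 0)
    rejected = status.get("numberOfEventsRejected", 0)
    # The batch is healthy only if it completed with no rejected events.
    ok = status.get("status") == "COMPLETED" and rejected == 0
    message = (
        f"{imported}/{status.get('totalNumberOfRecords')} events imported, "
        f"{rejected} rejected, batchId={status.get('batchId')}"
    )
    return ok, message
```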
Query the object events that you uploaded. You can also take the batch ID from the status file and run a simple filter query where batchTraceId matches the batch ID assigned to your import.
For example, to query the number of object events with a batchTraceId:
{
  businessObjectEvents(
    simpleFilter: {
      batchTraceId: "batchTraceId",
      tenantId: "tenantId"
    }
  ) {
    totalCount
  }
}
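The query above can be sent the same way as the other HTTP calls. Note that the GraphQL endpoint path shown here is an assumption, not taken from this page; use whatever query URL your tenant exposes, and add any required authentication headers.

```python
import json
import urllib.request

QUERY = """
{
  businessObjectEvents(
    simpleFilter: { batchTraceId: "batchTraceId", tenantId: "tenantId" }
  ) {
    totalCount
  }
}
"""

# Hypothetical endpoint path -- substitute the GraphQL URL for your tenant.
request = urllib.request.Request(
    "https://SCIS_endpoint/graphql",
    data=json.dumps({"query": QUERY}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```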
If you see any errors posted in your status file, or your query does not return all your expected data, try querying for error events of type IngestionErrorEventDetails and look for possible formatting issues with your imported data.
- Using an HTTP tool such as Postman, set the URL to https://SCIS_endpoint/api/v1/data/import/rules/rule_id
- Set the method to PUT.
- Set the Body with the new rule. You can optionally verify the rule was updated by changing the method to GET, and verifying that the return value shows the updated version of your rule.
- Using an HTTP tool such as Postman, set the URL to https://SCIS_endpoint/api/v1/data/import/rules/rule_id
- Set the method to DELETE. You can optionally verify that the rule was deleted by changing the method to GET, which should return the following error message:
{"message":"rule {{rule_id}} does not exist"}
Verify that your data has made its way into the system. Go to Data explorer and browse through your data to find the most relevant data objects that match your upload.
Related topics:
- Recommended: Loading data by using CSV files. It is recommended that you use a flat file simple canonical CSV format for your data loads, so that your data is easy to build, consume, and initially load. CSV files can be used to efficiently set up demo data, daily data integration, and automated data loads for production data.
- Exploring the data model. The data model demonstrates the relationships between the data objects used by the Intelligence Suite. The data that you load is used to populate the Control Tower dashboards and details pages.
- Importing bulk data by creating import rules and by uploading object files to IBM Cloud Object Storage.
- Importing bulk CSV data by creating import rules with transform and by uploading CSV object files to IBM Cloud Object Storage.