# Demo Code: Export and import a whole project

Three methods for exporting and importing a whole project are shown below. Only Method 1 is recommended.

## Prerequisite
Create your own YAML file based on [configuration_template.yaml](./configuration_template.yaml), then set `CONFIG_FILE` below to its path.

In [9]:
# point to a local file with credentials. It is not synced to git.
CONFIG_FILE = "export_import_whole_project.yaml"

In [10]:
import sys
import os
import yaml
import subprocess

In [11]:
# Load parameters from the YAML file
with open(CONFIG_FILE, 'r') as file:
    config = yaml.safe_load(file)

DSJOB_URL = config['url']
DSJOB_USER = config['user']
DSJOB_PWD = config['password']
EXPORT_PRJ_NAME = config['export_prj_name']

if 'export_zip_name' in config:
    EXPORT_ZIP_NAME = config['export_zip_name']
else:
    EXPORT_ZIP_NAME = EXPORT_PRJ_NAME+"_whl_prj.zip"

IMPORT_PRJ_NAME = config['import_prj_name']

print("EXPORT_PRJ_NAME=",EXPORT_PRJ_NAME)
print("IMPORT_PRJ_NAME=",IMPORT_PRJ_NAME)

EXPORT_PRJ_NAME= Multicloud Data Integration L3 Tech Lab
IMPORT_PRJ_NAME= DataStage Import
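The loading cell above raises a bare `KeyError` on the first missing key. As a minimal sketch, the same logic can be wrapped in a helper that reports every missing key at once and applies the same default zip name. The names `REQUIRED_KEYS` and `resolve_settings` are hypothetical and not part of the original notebook; the helper takes the dictionary that `yaml.safe_load` already returned.

```python
# Hypothetical helper: validates the mapping returned by yaml.safe_load.
REQUIRED_KEYS = ("url", "user", "password", "export_prj_name", "import_prj_name")


def resolve_settings(config):
    """Check required keys and fill in the default export zip name."""
    if not isinstance(config, dict):
        raise ValueError("configuration must be a mapping of key/value pairs")
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    settings = dict(config)
    # Same default as the cell above: <export project name>_whl_prj.zip
    settings.setdefault(
        "export_zip_name", settings["export_prj_name"] + "_whl_prj.zip"
    )
    return settings
```

In the notebook this could be called as `settings = resolve_settings(config)` right after the YAML file is loaded.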


In [12]:
# Enable dsjob by setting the env values below
%env CPDCTL_ENABLE_DSJOB=true
%env CPDCTL_ENABLE_DATASTAGE=true
%env CPDCTL_ENABLE_VOLUMES=1

env: CPDCTL_ENABLE_DSJOB=true
env: CPDCTL_ENABLE_DATASTAGE=true
env: CPDCTL_ENABLE_VOLUMES=1
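The `%env` magic only exists inside IPython. If these steps are run from a plain Python script, the same three switches can be set through `os.environ`, which child processes started with `subprocess` inherit. This is a sketch; `CPDCTL_SWITCHES` and `enable_cpdctl_plugins` are hypothetical names.

```python
import os

# The same three switches as the %env lines above.
CPDCTL_SWITCHES = {
    "CPDCTL_ENABLE_DSJOB": "true",
    "CPDCTL_ENABLE_DATASTAGE": "true",
    "CPDCTL_ENABLE_VOLUMES": "1",
}


def enable_cpdctl_plugins(env=None):
    """Set the switches on the given mapping (os.environ by default)."""
    if env is None:
        env = os.environ
    env.update(CPDCTL_SWITCHES)
    return env
```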


In [13]:
# Configure cpdctl with the parameters
!cpdctl config user set CP4D-user --username "$DSJOB_USER" --password "$DSJOB_PWD"
!cpdctl config profile set CP4D-profile --url "$DSJOB_URL" --user CP4D-user
!cpdctl config profile use CP4D-profile

Switched to profile "CP4D-profile".


In [24]:
# list all projects
!cpdctl dsjob list-projects

...
DataStage Import
test_python
Multicloud Data Integration L3 Tech Lab
test_python_pipeline

Total: 4 Projects

Status code = 0
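The listing above can be used to decide whether the target project still has to be created. The sketch below turns captured `list-projects` output into a list of names. It assumes the output looks like the sample shown above (one project name per line, followed by `Total:` and `Status code` lines); `parse_project_names` is a hypothetical helper, not part of cpdctl.

```python
def parse_project_names(listing):
    """Extract project names from captured `cpdctl dsjob list-projects` output.

    Assumes one name per line plus trailing "Total:" and "Status code" lines,
    as in the sample output above.
    """
    names = []
    for line in listing.splitlines():
        text = line.strip()
        # Skip blank lines, the elision marker, and the summary lines
        if not text or text == "..." or text.startswith(("Total:", "Status code")):
            continue
        names.append(text)
    return names
```

A check such as `IMPORT_PRJ_NAME in parse_project_names(output)` then tells you whether `create-project` needs to run.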


## Method 1: Use the **"export-datastage-assets"** and **"import-zip"** commands <span style="color:red">-- Recommended</span>

To do: confirm what happens to jobs that already exist in the target project. They are likely not replaced.

In [25]:
# Export the source project (here: Multicloud Data Integration L3 Tech Lab)
!cpdctl dsjob export-datastage-assets --project "$EXPORT_PRJ_NAME" --file-name "$EXPORT_ZIP_NAME" 

...
2024-06-03 21:51:27: Waiting until export finishes, Status: started
2024-06-03 21:51:38: Project export status: completed, total: 8, completed: 8, failed: 0.
status   completed

Status code =  0


In [8]:
# Create the project only if it doesn't already exist
!cpdctl dsjob create-project --name "$IMPORT_PRJ_NAME" 

2b527970-3327-4e4e-87a0-739a911493ec


In [26]:
!cpdctl dsjob import-zip --project "$IMPORT_PRJ_NAME" --file-name "$EXPORT_ZIP_NAME" --conflict-resolution replace --wait 200 

...
2024-06-03 21:51:43: Waiting until import finishes, import id: c8be1508-dded-4be9-b75f-bd0452408f90
2024-06-03 21:51:54: Project import status: started,  total: 8, completed: 1, failed: 0, skipped: 3.
2024-06-03 21:52:14: Project import status: started,  total: 8, completed: 3, failed: 0, skipped: 4.
2024-06-03 21:52:35: Project import status: completed,  total: 8, completed: 4, failed: 0, skipped: 4.
Information:
	Connection: Data Virtualization,	  New connection is exactly the same as an existing connection, resource is not updated.

	Connection: Data Warehouse,	  New connection is exactly the same as an existing connection, resource is not updated.

	Parameter Set: JB_parameter_set,	  New parameters are identical to those in the existing parameter set `JB_parameter_set`, flow is updated to reference `JB_parameter_set`.

	Parameter Set: paraset2,	  New parameters are identical to those in the existing parameter set `paraset2`, flow is updated to reference `paraset2`.


Status cod
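To check the result programmatically rather than by reading the log, the last status line can be parsed. The sketch below assumes the line format shown in the output above (`skipped` is optional because the export log for this method omits it); `STATUS_LINE` and `last_status` are hypothetical names.

```python
import re

# Matches lines such as:
# "Project import status: completed,  total: 8, completed: 4, failed: 0, skipped: 4."
STATUS_LINE = re.compile(
    r"Project (?:import|export) status: (?P<state>\w+),\s*"
    r"total: (?P<total>\d+), completed: (?P<completed>\d+), "
    r"failed: (?P<failed>\d+)(?:, skipped: (?P<skipped>\d+))?"
)


def last_status(log_text):
    """Return the final status line of a captured log as a dict, or None."""
    matches = list(STATUS_LINE.finditer(log_text))
    if not matches:
        return None
    fields = matches[-1].groupdict()
    state = fields.pop("state")
    # Convert the counters to integers; drop "skipped" when it is absent
    counts = {key: int(value) for key, value in fields.items() if value is not None}
    return {"state": state, **counts}
```

A caller could then treat the run as successful only when `state` is `"completed"` and `failed` is `0`.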

## Method 2: Use the **"export"** and **"import"** commands <span style="color:red">-- Not recommended</span>
There is a known issue: some connections are missing after the import. We are working with IBM support to resolve it. The current workaround is to recreate the missing connections manually after importing into a new project.
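To find out which connections need to be recreated, it helps to compare the connection names in the source and target projects. The sketch below only does the comparison; how the two name lists are collected (from the UI or from a CLI listing) is left open, and `missing_connections` is a hypothetical helper.

```python
def missing_connections(source_names, target_names):
    """Return the connection names present in the source but not the target."""
    target = set(target_names)
    return sorted(name for name in set(source_names) if name not in target)
```

For example, with the two connections that appear in the Method 1 log, `missing_connections(["Data Virtualization", "Data Warehouse"], ["Data Warehouse"])` reports that `Data Virtualization` still has to be recreated.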

In [14]:
!cpdctl dsjob export --project "$EXPORT_PRJ_NAME" --name "$EXPORT_PRJ_NAME" --export-file "$EXPORT_ZIP_NAME" --wait -1

...
pending
pending
pending
running
running
running
running
running
completed

Status code =  0


In [11]:
!cpdctl dsjob save-export --project "$EXPORT_PRJ_NAME" --name "$EXPORT_PRJ_NAME" --export-file "$EXPORT_ZIP_NAME"

...

Status code = 0


### Import to a new project

1. If the project to import into already exists, delete it.
2. Note that a project can't be deleted while it still contains jobs, so delete its jobs first.

In [49]:
def delete_all_jobs(prj_name):
    """Delete every job in the given project."""
    import re
    import subprocess

    # List the project's jobs together with their IDs
    command = f'cpdctl dsjob list-jobs --project "{prj_name}" --with-id --sched-info'
    output = subprocess.run(command, capture_output=True, text=True, shell=True)
    print(output.stdout)

    # Match table rows of the form "<job name> | <job id> |"
    pattern = re.compile(r'^\s*(.*?)\s*\|\s*([0-9a-f-]+)\s*\|', re.MULTILINE)
    matches = pattern.findall(output.stdout)
    job_names = [match[0].strip() for match in matches]
    # The header separator row ("--------") also matches the pattern; drop it
    if '--------' in job_names:
        job_names.remove('--------')
    print("Found Jobs:\n",job_names)

    for job_name in job_names:
        # Delete from the project passed in, not from the global IMPORT_PRJ_NAME
        !cpdctl dsjob delete-job --project "$prj_name" --name "$job_name"
        print(job_name, "has been deleted")
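The regular expression in `delete_all_jobs` also matches the separator row, which is why `'--------'` has to be removed afterwards. An alternative sketch splits each row on `|` and keeps only rows whose second column is a full UUID, so the header and separator rows are skipped automatically. It assumes the table layout shown in the `list-jobs` output of this notebook and job names that contain no `|` character; `parse_jobs` is a hypothetical helper.

```python
import re

# A job ID as printed by list-jobs --with-id, e.g. 4f139c64-c23a-4279-a423-e0f37a9f4469
UUID = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")


def parse_jobs(listing):
    """Return (job name, job ID) pairs from captured list-jobs output."""
    jobs = []
    for line in listing.splitlines():
        columns = [column.strip() for column in line.split("|")]
        # Header and separator rows have no UUID in the second column
        if len(columns) >= 2 and UUID.fullmatch(columns[1]):
            jobs.append((columns[0], columns[1]))
    return jobs
```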

In [50]:
# delete an existing project
import subprocess

# Execute the command
command = f'cpdctl project list --name "{IMPORT_PRJ_NAME}" --match exact'
result = subprocess.run(command, shell=True, capture_output=True, text=True)

# Capture the return code and output
return_code = result.returncode
output = result.stdout

# Print the return code and output
print("Return Code:", return_code)
print("Output:", output)

# Check if the output contains "Nothing to show"
if "Nothing to show" in output:
    print("The project does not exist or there was an error.")

else:
    print("The project exists. Deleting project",IMPORT_PRJ_NAME)
    delete_all_jobs(IMPORT_PRJ_NAME)
    # The project exists, so delete it now that its jobs are gone
    !cpdctl dsjob delete-project --project "$IMPORT_PRJ_NAME"


Return Code: 0
Output: ...
ID                                     Name               Created                    Description                                  Type
866edac4-9c9b-48a7-bb4c-1f51b44e860a   DataStage Import   2024-06-04T00:50:51.126Z   This Project is created using dsjob plugin   cpd

The project exists. Deleting project DataStage Import
...
Job Name                      |Job ID                              |Schedule Information
--------                      |------                              |--------------------
Employee Ranking.DataStage job|4f139c64-c23a-4279-a423-e0f37a9f4469|

Total: 1 Jobs

Status code = 0

Found Jobs:
 ['Employee Ranking.DataStage job']
...
Deleted Job:  Employee Ranking.DataStage job

Status code =  0
Employee Ranking.DataStage job has been deleted
...
{
    "StatusCode": 204,
    "Headers": {
        "Date": [
            "Tue, 04 Jun 2024 01:05:58 GMT"
        ],
        "Server": [
            "---"
        ],
        "Set-Cookie": [
          

In [51]:
# Create the project
!cpdctl dsjob create-project --name "$IMPORT_PRJ_NAME" 

d0c4a78e-bfe4-40e2-b7cd-04c2a39325ea


In [52]:
!cpdctl dsjob import --project "$IMPORT_PRJ_NAME" --import-file "$EXPORT_ZIP_NAME" --wait -1

...

ID: 9cf56e46-997b-495b-8cd9-44e51783d65b
running
completed

Status code =  0


## Method 3: Use the **export-project** and **import-zip** commands <span style="color:red">-- Not recommended</span>
There is a known issue: some connections are missing after the import. We are working with IBM support to resolve it. The current workaround is to recreate the missing connections manually after importing into a new project.

In [16]:
!cpdctl dsjob export-project --project "$EXPORT_PRJ_NAME" --file-name "$EXPORT_ZIP_NAME" --wait -1

...
2024-06-03 21:15:30: Waiting until export finishes, Status: started
2024-06-03 21:15:41: Project export status: completed, total: 5, completed: 5, failed: 0, skipped: 0.

Status code =  0


In [17]:
# Create the project only if it doesn't already exist
!cpdctl dsjob create-project --name "$IMPORT_PRJ_NAME" 

FAILED
Error creating the  project.:
Bad Request



In [None]:
!cpdctl dsjob import-zip --project "$IMPORT_PRJ_NAME" --file-name "$EXPORT_ZIP_NAME" --conflict-resolution replace --wait 200 

...
2024-06-03 21:15:48: Waiting until import finishes, import id: 3f5c31dd-da30-4817-a03f-63089524dd34
2024-06-03 21:15:59: Project import status: started,  total: 5, completed: 1, failed: 0, skipped: 2.
