To connect data stored in **MariaDB** and **MongoDB** databases to AWS Glue from the **command line (CLI)**, follow the steps below. The process involves creating connections, setting up crawlers, and running ETL jobs with AWS CLI commands.

### **1. Set Up IAM Role for AWS Glue**
First, ensure you have an IAM role with the necessary permissions to access AWS Glue, S3, and your databases (MariaDB and MongoDB).

Follow the tutorial in `setup-role-and-policy.ipynb`.

---

### **2. Create a JDBC Connection for MariaDB**

#### A. Upload JDBC Driver to S3:
Download the **MariaDB JDBC Driver** and upload it to an S3 bucket (e.g., `s3://my-bucket/mariadb-driver.jar`).

```bash
aws s3 cp /path/to/mariadb-driver.jar s3://my-bucket/mariadb-driver.jar
```

#### B. Create MariaDB Connection in AWS Glue:

```bash
aws glue create-connection \
  --connection-input '{
    "Name": "mariadb-connection",
    "ConnectionType": "JDBC",
    "ConnectionProperties": {
      "JDBC_CONNECTION_URL": "jdbc:mysql://<mariadb-host>:<port>/<database-name>",
      "USERNAME": "<username>",
      "PASSWORD": "<password>",
      "JDBC_DRIVER_JAR_URI": "s3://my-bucket/mariadb-driver.jar"
    }
  }'
```

Replace `<mariadb-host>`, `<port>`, `<database-name>`, `<username>`, and `<password>` with the appropriate values for your MariaDB instance.
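Typing the password into the command leaves it in your shell history. As a minimal sketch, the same `--connection-input` payload can be generated in Python from environment variables. The helper name `build_mariadb_connection_input` and the `MARIADB_*` variable names are assumptions made for illustration, not part of AWS Glue.

```python
import json
import os


def build_mariadb_connection_input(host, port, database, username, password,
                                   driver_uri="s3://my-bucket/mariadb-driver.jar"):
    """Return the dict passed to `aws glue create-connection --connection-input`."""
    return {
        "Name": "mariadb-connection",
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": f"jdbc:mysql://{host}:{port}/{database}",
            "USERNAME": username,
            "PASSWORD": password,
            "JDBC_DRIVER_JAR_URI": driver_uri,
        },
    }


if __name__ == "__main__":
    # Hypothetical variable names; the defaults are placeholders, not real values.
    payload = build_mariadb_connection_input(
        host=os.environ.get("MARIADB_HOST", "localhost"),
        port=int(os.environ.get("MARIADB_PORT", "3306")),
        database=os.environ.get("MARIADB_DB", "mydb"),
        username=os.environ.get("MARIADB_USER", "admin"),
        password=os.environ.get("MARIADB_PASSWORD", "change-me"),
    )
    # Print compact JSON so the output can be handed to the CLI.
    print(json.dumps(payload))
```

If the script is saved as, say, `build_connection.py` (a hypothetical filename), its output can be passed along with `aws glue create-connection --connection-input "$(python build_connection.py)"`.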

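#### C. Verify the Connection Definition:

`get-connection` returns the stored connection definition. It does not open a network connection to the database, so a successful response only confirms that the connection was created.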

```bash
aws glue get-connection --name mariadb-connection
```

---

### **3. Create a MongoDB Connection**

#### A. Create MongoDB Connection in AWS Glue:

```bash
aws glue create-connection \
  --connection-input '{
    "Name": "mongodb-connection",
    "ConnectionType": "MONGODB",
    "ConnectionProperties": {
      "CONNECTION_URL": "mongodb://<host>:<port>/<database-name>",
      "USERNAME": "<user>",
      "PASSWORD": "<password>"
    }
  }'
```

Replace `<user>`, `<password>`, `<host>`, `<port>`, and `<database-name>` with your MongoDB credentials and connection details.

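#### B. Verify the Connection Definition:

As with MariaDB, `get-connection` only returns the stored definition; it does not check that MongoDB is reachable.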

```bash
aws glue get-connection --name mongodb-connection
```

---

### **4. Create Crawlers for MariaDB and MongoDB**

#### A. Create a Crawler for MariaDB:

```bash
aws glue create-crawler \
  --name "mariadb-crawler" \
  --role "arn:aws:iam::<account-id>:role/GlueRole" \
  --database-name "mariadb-database" \
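  --targets '{"JdbcTargets":[{"ConnectionName":"mariadb-connection","Path":"<database-name>/<table-name>"}]}'
```

Replace `<account-id>`, `mariadb-database`, `<database-name>`, and `<table-name>` with appropriate values. The JDBC path has the form `<database-name>/<table-name>`; use `%` in place of the table name to crawl every table in the database.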

#### B. Create a Crawler for MongoDB:

```bash
aws glue create-crawler \
  --name "mongodb-crawler" \
  --role "arn:aws:iam::<account-id>:role/GlueRole" \
  --database-name "mongodb-database" \
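  --targets '{"MongoDBTargets":[{"ConnectionName":"mongodb-connection","Path":"<database-name>/<collection-name>"}]}'
```

Replace `<account-id>`, `mongodb-database`, `<database-name>`, and `<collection-name>` with appropriate values. The MongoDB path has the form `<database-name>/<collection-name>`.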
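The two `--targets` payloads differ only in the target key, so they are easy to generate. The sketch below builds both with the PascalCase key names of the Glue `CreateCrawler` API. The function name `crawler_targets` and the sample names `salesdb`, `orders`, and `events` are hypothetical.

```python
import json


def crawler_targets(kind, connection_name, path):
    """Build the --targets structure for a JDBC or MongoDB crawler."""
    keys = {"jdbc": "JdbcTargets", "mongodb": "MongoDBTargets"}
    if kind not in keys:
        raise ValueError(f"unsupported target kind: {kind!r}")
    return {keys[kind]: [{"ConnectionName": connection_name, "Path": path}]}


# Hypothetical database, table, and collection names, for illustration only.
print(json.dumps(crawler_targets("jdbc", "mariadb-connection", "salesdb/orders")))
print(json.dumps(crawler_targets("mongodb", "mongodb-connection", "salesdb/events")))
```

Each printed line can be used as the value of `--targets` in the corresponding `create-crawler` command.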

---

### **5. Start Crawlers**

After creating the crawlers, start them to populate the Glue Data Catalog with your MariaDB and MongoDB schemas. `start-crawler` returns immediately; the crawl itself runs asynchronously.

#### A. Start the MariaDB Crawler:

```bash
aws glue start-crawler --name mariadb-crawler
```

#### B. Start the MongoDB Crawler:

```bash
aws glue start-crawler --name mongodb-crawler
```
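
Because the crawl runs in the background, the catalog tables may not exist yet when the command returns. One way to wait, sketched here on the assumption that the AWS CLI is installed and configured, is to poll `aws glue get-crawler` until the crawler state returns to `READY`. The helper names `crawler_state` and `wait_until_ready` are hypothetical.

```python
import json
import subprocess
import time


def crawler_state(name):
    """Return the crawler state reported by `aws glue get-crawler`."""
    out = subprocess.run(
        ["aws", "glue", "get-crawler", "--name", name],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["Crawler"]["State"]


def wait_until_ready(name, get_state=crawler_state, interval=15, timeout=1800):
    """Poll until the crawler is READY again or `timeout` seconds have passed.

    Returns True if READY was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_state(name) == "READY":
            return True
        time.sleep(interval)
    return False
```

Calling `wait_until_ready("mariadb-crawler")` after `start-crawler` blocks until the crawl finishes. A crawler that has never been started also reports `READY`, so call this only after the start command has succeeded.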

---

### **6. Create an ETL Job to Transform Data**

You can use AWS Glue to perform ETL (Extract, Transform, Load) operations on your data. Here's how you can create a simple job using the AWS CLI.

#### A. Create an ETL Job:

For a simple example, assume you want to join data from both MariaDB and MongoDB and write it to S3.

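```bash
aws glue create-job \
  --name "mariadb-mongodb-etl-job" \
  --role "arn:aws:iam::<account-id>:role/GlueRole" \
  --glue-version "4.0" \
  --connections '{"Connections": ["mariadb-connection", "mongodb-connection"]}' \
  --command '{
    "Name": "glueetl",
    "ScriptLocation": "s3://my-bucket/scripts/mariadb-mongodb-etl-script.py",
    "PythonVersion": "3"
  }' \
  --default-arguments '{
    "--TempDir": "s3://my-bucket/temp/"
  }'
```

In `ScriptLocation`, replace the script path with the S3 location of your custom ETL script. The `--connections` list attaches both connections to the job so it can reach the two data stores, and the Glue version shown is only an example; use the version you target. Glue supplies its own Spark runtime, so PySpark should not be installed through `--additional-python-modules`.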

#### B. Write the ETL Script (Example in Python)

Here’s a simple PySpark ETL script that reads from both MariaDB and MongoDB, performs a transformation, and writes to S3:

```python
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job

# Resolve job arguments and initialize the Glue job context
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Read from MariaDB (using Glue Data Catalog)
mariadb_dynamic_frame = glueContext.create_dynamic_frame.from_catalog(
    database="mariadb-database",
    table_name="mariadb-table"
)

# Read from MongoDB (using Glue Data Catalog)
mongodb_dynamic_frame = glueContext.create_dynamic_frame.from_catalog(
    database="mongodb-database",
    table_name="mongodb-collection"
)

# Example transformation: equality join on the "id" field of each source
# (adjust the key names to your schema; MongoDB documents often use "_id")
joined_dynamic_frame = mariadb_dynamic_frame.join(
    paths1=["id"],
    paths2=["id"],
    frame2=mongodb_dynamic_frame,
    transformation_ctx="joined_data"
)

# Write the result to S3 in JSON format
glueContext.write_dynamic_frame.from_options(
    frame=joined_dynamic_frame,
    connection_type="s3",
    connection_options={"path": "s3://my-output-bucket/"},
    format="json"
)

job.commit()
```

Upload this script to an S3 bucket (e.g., `s3://my-bucket/scripts/mariadb-mongodb-etl-script.py`).
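
To see which rows survive the join step before running it on real data, it can help to reproduce the logic on a few records. The sketch below is plain Python rather than Glue code and assumes the join behaves as an inner equality join on `id`. The function `inner_join` and the sample rows are hypothetical.

```python
def inner_join(left, right, key="id"):
    """Pair up rows whose `key` values are equal; unmatched rows are dropped."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical sample data standing in for the two sources.
orders = [{"id": 1, "total": 40}, {"id": 2, "total": 15}]        # MariaDB rows
profiles = [{"id": 2, "plan": "pro"}, {"id": 3, "plan": "free"}]  # MongoDB documents

print(inner_join(orders, profiles))
# [{'id': 2, 'total': 15, 'plan': 'pro'}]
```

Only `id` 2 appears in both inputs, so it is the only row in the result. If the key types differ between sources (for example an integer in MariaDB and a string in MongoDB), no rows will match until one side is cast.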

---

### **7. Start the ETL Job**

To start the ETL job:

```bash
aws glue start-job-run --job-name mariadb-mongodb-etl-job
```

---

### **8. Monitor the Job**

You can check the status of a job run with the following command. The job's logs are written to Amazon CloudWatch Logs rather than returned by this command:

```bash
aws glue get-job-run --job-name mariadb-mongodb-etl-job --run-id <job-run-id>
```

You can find the `job-run-id` in the output of the `start-job-run` command or by listing all runs with `aws glue get-job-runs --job-name mariadb-mongodb-etl-job`.
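
When scripting these steps, the run ID and state can be pulled out of the CLI's JSON output. This is a minimal sketch that parses captured output only and does not call AWS. The function names `job_run_id` and `summarize` and the sample strings are hypothetical; the `JobRunId`, `JobRun`, `JobRunState`, and `ErrorMessage` fields are the ones the two commands return.

```python
import json

# Run states after which the job run will not change any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def job_run_id(start_output):
    """Extract JobRunId from the JSON printed by `aws glue start-job-run`."""
    return json.loads(start_output)["JobRunId"]


def summarize(get_job_run_output):
    """Return (state, finished, error) from `aws glue get-job-run` JSON output."""
    run = json.loads(get_job_run_output)["JobRun"]
    state = run["JobRunState"]
    return state, state in TERMINAL_STATES, run.get("ErrorMessage")


# Hypothetical captured outputs, shortened to the fields used here.
started = '{"JobRunId": "jr_example"}'
status = '{"JobRun": {"Id": "jr_example", "JobRunState": "RUNNING"}}'

print(job_run_id(started))
print(summarize(status))
```

A polling loop would call `get-job-run` repeatedly and stop once `finished` is true, reporting `error` when the state is not `SUCCEEDED`.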

---

### **9. Cleanup**

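To delete the job, crawlers, and connections after your work is done (the Data Catalog databases and tables created by the crawlers, and any objects in S3, are not removed by these commands):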

```bash
aws glue delete-job --job-name mariadb-mongodb-etl-job
aws glue delete-crawler --name mariadb-crawler
aws glue delete-crawler --name mongodb-crawler
aws glue delete-connection --name mariadb-connection
aws glue delete-connection --name mongodb-connection
```

---

### **Conclusion**

By using the AWS CLI, you can automate setting up connections to MariaDB and MongoDB, running crawlers to catalog data, and executing ETL jobs in AWS Glue. This provides a scalable, repeatable way to process data from different sources and load it into targets such as S3.