<div id="singlestore-header" style="display: flex; background-color: rgba(235, 249, 245, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/database.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">Backup Database to AWS S3</h1>
    </div>
</div>


## Intro

<p class="has-text-justified">
    This Python notebook simplifies database backups to AWS S3 and is designed to be run on a schedule.
</p>

## What you will learn in this notebook:

1. How to back up a database to AWS S3 [SQL]


## What benefits do you get out of using the notebook?

1. With this parameterized notebook, you can perform both initial and incremental backups to S3 by supplying only the input parameters.


## Questions?

Reach out to us through our [forum](https://www.singlestore.com/forum).

### Pre-requisites

We need the following to proceed.

<ol type="A">
    <li>SingleStore provides a secrets feature to store and manage sensitive data. We will use it to access the AWS access key ID and the AWS secret access key.</li>
    <li>The database user must have the 'BACKUP', 'OUTBOUND', and 'PROCESS' grants.</li>
    <li>The S3 path provided should not already exist. The bucket must exist; for an initial backup, the rest of the path is created if it does not exist.</li>
</ol>

<p>Note: </p>

<ol>
    <li>Check user grants by running 'show grants'.</li>
    <li>If the S3 path does not exist, SingleStore creates it.</li>
    <li>The backup is generally stored as 'database_name.backup'.</li>
    <li>The AWS IAM user must have S3 read and write access.</li>
</ol>
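
The grant check mentioned above can be scripted. The following is a minimal sketch, assuming `show grants` returns MySQL-style lines such as `GRANT SELECT, BACKUP ON *.* TO 'user'@'%'`; the helper name `missing_grants` is hypothetical and not part of the notebook.

```python
# Hypothetical helper: not part of the original notebook.
REQUIRED_GRANTS = {"BACKUP", "OUTBOUND", "PROCESS"}


def missing_grants(grant_rows):
    """Return the required privileges that do not appear in the grant lines.

    Assumes each line looks like "GRANT <priv>, <priv> ON <scope> TO <user>".
    """
    granted = set()
    for line in grant_rows:
        upper = line.upper()
        if not upper.startswith("GRANT "):
            continue
        # The privilege list sits between "GRANT " and " ON ".
        privileges = upper[len("GRANT "):].split(" ON ", 1)[0]
        granted.update(p.strip() for p in privileges.split(","))
    if "ALL PRIVILEGES" in granted:
        return set()
    return REQUIRED_GRANTS - granted


rows = ["GRANT SELECT, BACKUP, PROCESS ON *.* TO 'backup_user'@'%'"]
print(missing_grants(rows))
```

An empty result means all three grants were found. If the actual output format differs, the parsing step needs adjusting.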


### Secrets

|SECRET NAME|DESCRIPTION|
|---|---|
|BACKUP_APP_AWS_API_KEY_ID| AWS IAM USER API KEY ID |
|BACKUP_APP_AWS_API_SECRET_KEY|  AWS IAM USER SECRET KEY |
|BACKUP_APP_AWS_BUCKET_REGION| AWS BUCKET REGION |

### Imports

In [1]:
import io
import logging
import time

import singlestoredb as s2
from IPython.display import display, HTML

### Variables

In [2]:
is_incremental = 'N'
database_to_bkp = None
backup_all_databases = input("Do you want to back up all databases? Enter 'Y' for yes, 'N' for no")
if backup_all_databases == 'N':
    database_to_bkp = input('Enter database name to backup')
s3_target_path = input('Enter S3 Path to use for backup')
if backup_all_databases == 'N':
    is_incremental = input("Do you require incremental backup. 'Y' for True, 'N' for False ")
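
The cell above compares each answer to `'N'` exactly, so an answer such as `'n'` or `' N'` falls through to the all-databases branch later in the notebook. A minimal sketch of normalizing the answers first is shown here; `parse_yes_no` is a hypothetical helper, not part of the notebook.

```python
# Hypothetical helper: not part of the original notebook.
def parse_yes_no(answer):
    """Map a free-form answer to 'Y' or 'N'; raise ValueError otherwise."""
    normalized = answer.strip().upper()
    if normalized in ("Y", "YES"):
        return "Y"
    if normalized in ("N", "NO"):
        return "N"
    raise ValueError(f"Expected 'Y' or 'N', got {answer!r}")


print(parse_yes_no(" y "))
print(parse_yes_no("No"))
```

Wrapping each `input(...)` call in such a function would stop a mistyped answer from silently selecting the wrong branch.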

### Functions to display various alerts

In [3]:
def show_warn(warn_msg):
    """
    Display a warning message in a formatted HTML alert box.

    Parameters
    ----------
    warn_msg : str
        The warning message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-warning">
    <b class="fa fa-solid fa-exclamation-circle"></b>
    <div>
        <p><b>Action Required</b></p>
        <p>{warn_msg}</p>
    </div>
</div>'''))

def show_error(error_msg):
    """
    Display an error message in a formatted HTML alert box.

    Parameters
    ----------
    error_msg : str
        The error message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-danger">
    <b class="fa fa-solid fa-exclamation-triangle"></b>
    <div>
        <p><b>Error</b></p>
        <p>{error_msg}</p>
    </div>
</div>'''))


def show_success(success_msg):
    """
    Display a success message in a formatted HTML alert box.

    Parameters
    ----------
    success_msg : str
        The success message to display.
    """
    display(HTML(f'''<div class="alert alert-block alert-success">
    <b class="fa fa-solid fa-check-circle"></b>
    <div>
        <p><b>Success</b></p>
        <p>{success_msg}</p>
    </div>
</div>'''))

### LogControl

In [4]:
def set_logging_enabled(enabled):
    if enabled:
        logging.getLogger().setLevel(logging.INFO)
    else:
        logging.getLogger().setLevel(logging.ERROR)

**Note**

To enable logs, change 'set_logging_enabled(False)' to 'set_logging_enabled(True)' in the code below.

### Utility functions for handling S3 paths, SQL statements, and backups

In [5]:
def get_bkp_path(s3_path, db_name):
    """
    Get the backup path based on the type of backup (incremental or initial).

    Parameters
    ----------
    s3_path : str
        The base S3 path for backups.
    db_name : str
        The name of the database.

    Returns
    -------
    str
        The final backup path.

    """
    if is_incremental == 'Y':
        logging.info('Is an incremental backup, will use exact path')
        return s3_path
    else:
        logging.info('Is an initial backup, will use time appended path')
        t = time.localtime(time.time())
        my_path = f'{s3_path}/{db_name}/{t.tm_year}-{t.tm_mon:02d}-{t.tm_mday:02d}/{t.tm_hour:02d}{t.tm_min:02d}{t.tm_sec:02d}/'
        logging.info(f'Backup Path : {my_path}')
        print(f'Backup Path : {my_path}')
        return my_path


def get_org_secret(lookup_key):
    """
    Get the organization secret using a lookup key.

    Parameters
    ----------
    lookup_key : str
        The lookup key for the organization secret.

    Returns
    -------
    str
        The organization secret.

    """
    return s2.manage_workspaces().org.get_secret(lookup_key).value


def get_sql_statement(db_name_to_bkp):
    """
    Get the SQL statement for backing up a database.

    Parameters
    ----------
    db_name_to_bkp : str
        The name of the database to backup.

    Returns
    -------
    str
        The SQL statement for backup.

    """
    aws_key_id = get_org_secret('BACKUP_APP_AWS_API_KEY_ID')
    aws_secret_key = get_org_secret('BACKUP_APP_AWS_API_SECRET_KEY')
    aws_region = get_org_secret('BACKUP_APP_AWS_BUCKET_REGION')
    data = io.StringIO()
    data.write('BACKUP DATABASE ' + db_name_to_bkp + ' ')
    if is_incremental == 'Y':
        data.write(' WITH DIFFERENTIAL ')
    else:
        data.write(' WITH INIT ')
    data.write(' TO S3 "' + get_bkp_path(s3_target_path, db_name_to_bkp) + '" ')
    data.write(' CONFIG \' {"region":"' + aws_region + '"} \'')
    data.write(' CREDENTIALS \'{"aws_access_key_id":"' + aws_key_id
               + '","aws_secret_access_key":"' + aws_secret_key + '"}\' ')
    logging.debug(f'statement: {data.getvalue()}')
    return data.getvalue()


def perform_backup(my_cursor, curr_db_name):
    """
    Perform a database backup.

    Parameters
    ----------
    my_cursor : cursor
        The database cursor.
    curr_db_name : str
        The name of the database to backup.

    """
    logging.debug(f'backing up db {curr_db_name}')
    my_cursor.execute(get_sql_statement(curr_db_name))
    results = my_cursor.fetchall()
    if results is None:
        logging.error('Backup execution failed')
    else:
        logging.info("Backup completed")

In [6]:
# Check if the connection URL ends with a '/'
# If it does, display a warning message and exit
if connection_url.endswith('/'):
    show_warn('Database not selected. Please select one from the dropdown at the top of the page.')
else:
    try:
        # Disable logging for this section
        set_logging_enabled(False)

        # Establish a connection to the database
        conn = s2.connect(results_type='dict')
        with conn.cursor() as cursor:

            # If backup_all_databases is 'N', backup only the specified database
            if backup_all_databases == 'N':
                perform_backup(my_cursor=cursor, curr_db_name=database_to_bkp)
            # If backup_all_databases is 'Y', backup all databases except system databases
            else:
                # Get a list of databases to backup
                cursor.execute(
                    "SELECT schema_name FROM information_schema.schemata WHERE  schema_name NOT IN ( 'cluster', 'memsql', 'information_schema' );")
                for row in cursor.fetchall():
                    logging.debug(f"processing db {row['schema_name']}")
                    # Backup each database
                    perform_backup(my_cursor=cursor, curr_db_name=row['schema_name'])
                    logging.debug(f"processing db {row['schema_name']} complete")

        # Show success message if backup process completed successfully
        show_success('Backup Process Completed')
    except s2.exceptions.OperationalError as ope:
        # Handle specific operational errors
        if 'NoSuchBucket' in ope.errmsg:
            logging.error('Provided S3 Bucket does not exist. Please check.')
            show_error('Provided S3 Bucket does not exist. Please check.')
        elif 'Access denied' in ope.errmsg:
            logging.error('Failed to backup due to missing grants or firewall settings. Please check.')
            show_error('Failed to backup due to missing grants or firewall settings. Please check.')
        else:
            logging.error(f'Failed. Error message: {ope.errmsg}')
            show_error(f'Failed to backup. {ope.errmsg}')
    except s2.Error as e:
        # Handle any other errors
        print(f'Encountered exception {e}')
        show_error(f'Failed to backup. {str(e)}')

    print('\n\nScript execution completed')

### Verify Result

If the script ran without errors, check the S3 bucket for the uploaded files (the backup path is printed to the console).

The general format is 'database_name.backup' or 'database_name.incr_backup'.

You can use the query below to check the backups created (add a filter to limit the results as needed):

    select * from information_schema.MV_BACKUP_HISTORY
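
A filtered version of that query might look like the sketch below. The column names `DATABASE_NAME` and `START_TIMESTAMP` and the `%s` placeholder style are assumptions to verify against your SingleStore version and driver. `backup_history_query` is a hypothetical helper.

```python
# Hypothetical helper: not part of the original notebook.
def backup_history_query(db_name=None, limit=10):
    """Return (sql, params) listing the most recent backups first.

    DATABASE_NAME and START_TIMESTAMP are assumed column names.
    """
    sql = "SELECT * FROM information_schema.MV_BACKUP_HISTORY"
    params = []
    if db_name is not None:
        # Pass the name as a parameter instead of pasting it into the SQL.
        sql += " WHERE DATABASE_NAME = %s"
        params.append(db_name)
    sql += f" ORDER BY START_TIMESTAMP DESC LIMIT {int(limit)}"
    return sql, tuple(params)


sql, params = backup_history_query("db1", limit=5)
print(sql)
print(params)
```

The pair can then be passed to `cursor.execute(sql, params)` on an open connection.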

**Important Note**

- To use this as a scheduled notebook, modify it to read configuration data from a table instead of user input.
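
One way to make that change is to map a configuration row onto the variables the notebook currently fills with `input(...)`. This is a minimal sketch, assuming the row arrives as a dict (as it does with `results_type='dict'`). The column names and the helper `settings_from_row` are hypothetical.

```python
# Hypothetical helper and column names: not part of the original notebook.
def settings_from_row(row):
    """Translate one configuration row (a dict) into the notebook's variables."""
    backup_all = str(row.get("backup_all_databases", "N")).strip().upper()
    incremental = str(row.get("is_incremental", "N")).strip().upper()
    return {
        "backup_all_databases": backup_all,
        # A database name is only needed when backing up a single database.
        "database_to_bkp": None if backup_all == "Y" else row["database_name"],
        "s3_target_path": row["s3_target_path"],
        # The notebook only asks about incremental backups for a single database.
        "is_incremental": "N" if backup_all == "Y" else incremental,
    }


row = {
    "backup_all_databases": "n",
    "database_name": "sales",
    "s3_target_path": "my-bucket/backups",
    "is_incremental": "y",
}
print(settings_from_row(row))
```

The returned values can replace the four `input(...)` calls in the Variables cell, leaving the rest of the notebook unchanged.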

<div id="singlestore-footer" style="background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px"></div>
<div><img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png" style="padding: 0px; margin: 0px; height: 24px"/></div>