## Local Producer

In this exercise, you'll run a producer on your computer to send some arbitrary files to a topic.

In [1]:
# imports
import pathlib, logging, importlib, datetime
from threading import Thread
from openmsitoolbox.logging import OpenMSILogger
from openmsistream import UploadDataFile, DataFileUploadDirectory

In [2]:
# Configure a logger (only needed when running in a Jupyter notebook like this)
logger = OpenMSILogger("LocalProducer", filelevel=None)
importlib.reload(logging)

<module 'logging' from '/usr/local/anaconda3/envs/sensorpush/lib/python3.9/logging/__init__.py'>

In [3]:
# The name of the topic to work with
TOPIC_NAME = "test"

# Paths to the config file and the directory holding the test files
repo_root_dir = pathlib.Path().resolve().parent
CONFIG_FILE_PATH = repo_root_dir / "config_files" / "confluent_cloud_broker.config"
TEST_FILE_DIR = repo_root_dir.parent / "test_files"
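Both paths are built relative to the notebook's working directory, so a quick sanity check before starting a producer can save a confusing error later. The sketch below is only a suggestion for what might be worth checking. `check_inputs` is a hypothetical helper name and is not part of OpenMSIStream.

```python
import pathlib


def check_inputs(config_file_path, test_file_dir):
    """Return the visible files in test_file_dir, raising if either path is missing.

    Hypothetical helper for this notebook; not part of OpenMSIStream.
    """
    config_file_path = pathlib.Path(config_file_path)
    test_file_dir = pathlib.Path(test_file_dir)
    if not config_file_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_file_path}")
    if not test_file_dir.is_dir():
        raise NotADirectoryError(f"Test file directory not found: {test_file_dir}")
    # Ignore hidden files (like .DS_Store) and any subdirectories
    return sorted(
        path
        for path in test_file_dir.iterdir()
        if path.is_file() and not path.name.startswith(".")
    )
```

In this notebook the call would be `check_inputs(CONFIG_FILE_PATH, TEST_FILE_DIR)`, which returns the list of files that the next cell is expected to upload.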

### First, let's just call UploadDataFile for each file in the directory

This will start up a producer and send every chunk of each file to the topic.

In [4]:
# For every file in the folder
for upload_file_path in TEST_FILE_DIR.glob("*"):
    # Skip any hidden files (like .DS_Store....)
    if upload_file_path.name.startswith("."):
        continue
    # Create an UploadDataFile and call the function to upload it to the topic
    upload_file = UploadDataFile(upload_file_path, logger=logger)
    upload_file.upload_whole_file(CONFIG_FILE_PATH, TOPIC_NAME)

[LocalProducer 2023-12-08 12:40:19] Uploading /Users/margareteminizer/Desktop/dmref_materials_project/openmsistream_short_course/test_files/testing_1.txt to test in 524288-byte chunks using 2 threads....
[LocalProducer 2023-12-08 12:40:19] Waiting for all enqueued messages to be delivered (this may take a moment)....
[LocalProducer 2023-12-08 12:40:19] Done uploading /Users/margareteminizer/Desktop/dmref_materials_project/openmsistream_short_course/test_files/testing_1.txt
[LocalProducer 2023-12-08 12:40:19] Uploading /Users/margareteminizer/Desktop/dmref_materials_project/openmsistream_short_course/test_files/testing_2.txt to test in 524288-byte chunks using 2 threads....
[LocalProducer 2023-12-08 12:40:19] Waiting for all enqueued messages to be delivered (this may take a moment)....
[LocalProducer 2023-12-08 12:40:19] Done uploading /Users/margareteminizer/Desktop/dmref_materials_project/openmsistream_short_course/test_files/testing_2.txt
[LocalProducer 2023-12-08 12:40:20] Uploadin
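The log lines above report 524288-byte (512 KiB) chunks. As a rough illustration of what "every chunk" means, the sketch below computes how many chunks a file of a given size would be split into. It only does the arithmetic: `n_chunks` is a hypothetical helper, and the chunk size is taken from the log output rather than from the OpenMSIStream API.

```python
CHUNK_SIZE = 524288  # bytes per chunk, as reported in the log output above


def n_chunks(file_size_bytes, chunk_size=CHUNK_SIZE):
    """Number of chunks needed to cover a file of the given size (ceiling division)."""
    if file_size_bytes < 0 or chunk_size <= 0:
        raise ValueError("file size must be >= 0 and chunk size must be > 0")
    return (file_size_bytes + chunk_size - 1) // chunk_size


for size in (100, CHUNK_SIZE, CHUNK_SIZE + 1, 5 * 1024 * 1024):
    print(f"{size:>8} bytes -> {n_chunks(size)} chunk(s)")
```

Small text files like the test files here would fit in a single chunk each, which is consistent with the uploads in the log finishing within the same second.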

### Next, let's watch for new files in a folder using a DataFileUploadDirectory

You could run this as an interactive program from the command line and type a command to shut it down when you're done, but here we'll run it in a separate thread from this notebook.

In [5]:
def upload_task(upload_directory, *args, **kwargs):
    """Run "upload_files_as_added" for a given DataFileUploadDirectory, and log a message
    when it gets shut down

    Args:
        upload_directory (DataFileUploadDirectory): the DataFileUploadDirectory to run
        args (tuple): passed through to "upload_files_as_added"
        kwargs (dict): passed through to "upload_files_as_added"

    Returns:
        None
    """
    start_time = datetime.datetime.now()
    # This call to "upload_files_as_added" waits until the program is shut down
    uploaded_filepaths = upload_directory.upload_files_as_added(*args, **kwargs)
    end_time = datetime.datetime.now()
    ts_format = "%m-%d-%Y %H:%M:%S"
    start_stamp = start_time.strftime(ts_format)
    end_stamp = end_time.strftime(ts_format)
    # Log a message listing the files that were uploaded during the run
    msg = (
        f"The following files were uploaded between {start_stamp} and {end_stamp}:\n\t"
    )
    msg += "\n\t".join([str(fp) for fp in uploaded_filepaths])
    upload_directory.logger.info(msg)

In [6]:
# Create the DataFileUploadDirectory
dfud = DataFileUploadDirectory(TEST_FILE_DIR, CONFIG_FILE_PATH, logger=logger)
# Start running its "upload_files_as_added" function in a separate thread
upload_thread = Thread(
    target=upload_task,
    args=(
        dfud,
        TOPIC_NAME,
    ),
)
upload_thread.start()

[LocalProducer 2023-12-08 12:41:11] Will upload new files added to/Users/margareteminizer/Desktop/dmref_materials_project/openmsistream_short_course/test_files to the test topic as 524288-byte chunks using 2 threads


#### With the above cell running, any files you move into the watched directory will be uploaded
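One way to trigger an upload without leaving the notebook is to copy a file into the watched directory. The sketch below uses only the standard library. `copy_into_watched_dir` is a hypothetical helper, and the demo runs against temporary directories so it is safe to execute anywhere; in this notebook you would pass `TEST_FILE_DIR` as the watched directory instead.

```python
import pathlib
import shutil
import tempfile


def copy_into_watched_dir(source_path, watched_dir, suffix=" copy"):
    """Copy source_path into watched_dir under a new name and return the new path."""
    source_path = pathlib.Path(source_path)
    new_name = f"{source_path.stem}{suffix}{source_path.suffix}"
    destination = pathlib.Path(watched_dir) / new_name
    shutil.copy2(source_path, destination)
    return destination


# Demo with throwaway directories standing in for the real ones
with tempfile.TemporaryDirectory() as tmp:
    tmp = pathlib.Path(tmp)
    source = tmp / "testing_2.txt"
    source.write_text("example contents\n")
    watched = tmp / "watched"
    watched.mkdir()
    new_file = copy_into_watched_dir(source, watched)
    print(new_file.name)  # prints "testing_2 copy.txt"
```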

In [7]:
# Manually shut down the upload directory (if running from the command line this would
# be like typing "q" in the Terminal window)
dfud.control_command_queue.put("q")
upload_thread.join()

[LocalProducer 2023-12-08 12:41:32] Will quit after all currently enqueued files are done being transferred.
[LocalProducer 2023-12-08 12:41:32] Waiting for all enqueued messages to be delivered (this may take a moment)
[LocalProducer 2023-12-08 12:41:33] The following files were uploaded between 12-08-2023 12:41:11 and 12-08-2023 12:41:33:
	testing_2 copy.txt
	testing_3 copy.txt
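The shutdown in the last cell works by putting a command on a queue, which the running upload loop presumably checks between files. The sketch below reproduces that general pattern with only `threading` and `queue`. It is not OpenMSIStream's implementation, and `watch_loop` and its arguments are hypothetical names chosen for illustration.

```python
import queue
import threading


def watch_loop(command_queue, work_items, processed, poll_interval=0.05):
    """Process work items until a "q" command arrives, then finish what is enqueued."""
    while True:
        try:
            # Check for a control command, waiting briefly if none is pending
            command = command_queue.get(timeout=poll_interval)
        except queue.Empty:
            command = None
        if command == "q":
            break
        try:
            processed.append(work_items.get_nowait())
        except queue.Empty:
            pass
    # After "q", drain anything that was already enqueued before exiting
    while True:
        try:
            processed.append(work_items.get_nowait())
        except queue.Empty:
            break


commands = queue.Queue()
work = queue.Queue()
done = []
worker = threading.Thread(target=watch_loop, args=(commands, work, done))
worker.start()
for name in ("a.txt", "b.txt"):
    work.put(name)
commands.put("q")  # request shutdown, like typing "q" at the command line
worker.join()
print(done)  # prints ['a.txt', 'b.txt']
```

Draining the work queue after the quit command mirrors the "Will quit after all currently enqueued files are done being transferred" message in the log: a shutdown request stops new work from being picked up later but lets already-queued work finish.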
