Spark sentry, revisited #1400
Conversation
…makes the code a little more future-proof for some anticipated BU improvements.
Looks pretty good to me; I just found a few places where we didn't roll back the method name properly. I'll make these changes myself.
listenbrainz_spark/ftp/download.py (Outdated)

     mapping_file_name = self.get_latest_mapping(mapping)

     t0 = time.monotonic()
-    current_app.logger.info('Downloading {} from FTP...'.format(mapping_file_name))
+    logging.info('Downloading {} from FTP...'.format(mapping_file_name))
log instances in this file haven't been changed from logging -> logger
listenbrainz_spark/ftp/download.py (Outdated)

@@ -145,9 +147,9 @@ def download_listens(self, directory, listens_dump_id=None, dump_type=FULL):
         listens_file_name (str): name of downloaded listens dump.
         dump_id (int): Unique indentifier of downloaded listens dump.
     """
-    ftp_cwd = current_app.config['FTP_LISTENS_DIR'] + 'fullexport/'
+    ftp_cwd = config.FTP_LISTENS_DIR + 'fullexport/'
should use os.path.join to prevent issues with forgetting to add a / in the config file
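For illustration, `os.path.join` inserts a separator only when one is missing, so the concatenation above becomes robust to a trailing slash in the config value (the paths below are hypothetical stand-ins for the config entries):

```python
import os.path

# Hypothetical config values, with and without a trailing slash:
with_slash = 'ftp/listens/'
without_slash = 'ftp/listens'

# os.path.join adds a separator only when one is missing, so both
# spellings of the config value produce the same path:
print(os.path.join(with_slash, 'fullexport'))     # ftp/listens/fullexport
print(os.path.join(without_slash, 'fullexport'))  # ftp/listens/fullexport
```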
-    current_app.logger.info('Some ratings are less than -1 \nMin rating: {}'.format(min_rating))
+    logger.info('Some ratings are less than -1 \nMin rating: {}'.format(min_rating))

     return (max_rating > 1.0, min_rating < -1.0)
add the return type to the docstring
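A sketch of what the updated docstring could look like; the function name, argument names, and wording are assumptions modeled on the `return (max_rating > 1.0, min_rating < -1.0)` line above, not the project's actual code:

```python
def ratings_out_of_range(max_rating, min_rating):
    """Check whether ratings fall outside the expected [-1.0, 1.0] range.

    Args:
        max_rating (float): largest rating observed.
        min_rating (float): smallest rating observed.

    Returns:
        tuple(bool, bool): (True if max_rating exceeds 1.0,
                            True if min_rating is below -1.0).
    """
    return (max_rating > 1.0, min_rating < -1.0)
```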
@@ -93,15 +91,15 @@ def push_to_result_queue(self, messages):
         avg_size_of_message //= num_of_messages
     except ZeroDivisionError:
         avg_size_of_message = 0
-        current_app.logger.warn("No messages calculated", exc_info=True)
+        logging.warn("No messages calculated", exc_info=True)
warn is deprecated, replace with warning
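For reference, `Logger.warn` is a deprecated alias of `Logger.warning` in the standard library. A minimal sketch of the preferred call, using a throwaway logger writing to an in-memory stream (the logger name and message are placeholders, not the project's configuration):

```python
import io
import logging

# Throwaway logger for demonstration; names are placeholders.
stream = io.StringIO()
logger = logging.getLogger("demo")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.WARNING)

# Preferred spelling: warning(). warn() still works but is a
# deprecated alias that may emit a DeprecationWarning.
logger.warning("No messages calculated")
print(stream.getvalue().strip())  # No messages calculated
```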
this uses logging, replace with logger
@@ -14,5 +14,7 @@ time ./run.sh /usr/local/spark/bin/spark-submit \
     --conf "spark.driver.memoryOverhead"=$DRIVER_MEMORY_OVERHEAD \
     --conf "spark.executor.memoryOverhead"=$EXECUTOR_MEMORY_OVERHEAD \
     --conf "spark.driver.maxResultSize"=$DRIVER_MAX_RESULT_SIZE \
+    --conf "spark.python.use.daemon"=true \
+    --conf "spark.python.daemon.module"=sentry_daemon \
indentation in this file is all over the place, we can fix it up
Fine by me!
This PR replaces #1289 by adding an intermediate logging object that should make future BU improvements easier to implement. This PR also fixes merge conflicts.
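A minimal sketch of what such an intermediate logging object could look like: a module-level logger that call sites import instead of reaching for `current_app.logger` directly, so the backend can be swapped later without editing every call site (module and function names here are assumptions, not the PR's actual code):

```python
import logging

# Hypothetical shared logger: call sites do `from <this module> import logger`
# rather than touching current_app.logger, decoupling them from Flask.
logger = logging.getLogger("listenbrainz_spark")

def init_loggers(debug=False):
    """Configure the shared logger once at process startup."""
    level = logging.DEBUG if debug else logging.INFO
    logging.basicConfig(
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        level=level,
    )
    logger.setLevel(level)
```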