Scheduled job executing more than 1 time #476

Closed
TallGibbs opened this issue Jan 25, 2024 · 27 comments
Labels
bug Something isn't working

Comments

@TallGibbs

Description

I scheduled a script to run at 12:00 PM on each weekday. When the next scheduled time came around, the job appears to have executed 3 separate times, with overlapping run times. I noticed this because my script is set up to send an automatic email, and I received 3 copies of the email, each with different results from executing my script. I then checked the jupyter lab "notebook job" tab and saw 3 "completed" results for the same script. However, in my "notebook job definitions" area, the script was only scheduled once.

I'm wondering if the environment selected to run the job in affected this. I scheduled it to run in the "base" environment (anaconda3) but I actually built the script in "projects" environment. Also, I had previously scheduled this job, but then made revisions to the script. So I deleted the scheduled job, and then scheduled it again. Could this have affected the results?

image

Here is my revised scheduled job (sorry, this screenshot is taken from AFTER the original issue showed up and I made a new schedule before opening this ticket).
image

Reproduce

I'm not actually sure how to reproduce the issue. All I did was schedule the script to run and this issue happened.

Expected behavior

I wanted the job to run only 1 time at the scheduled time. My script sends an automatic email, so I expected to see only 1 email and 1 completed job in the "completed job" tab.
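
For reference, the intended schedule (12:00 PM, Monday through Friday) can be sketched in plain Python. This is only an illustration of the expected run times, not how Jupyter Scheduler computes them. The function names are hypothetical, and the cron expression in the comment is an assumed equivalent, since the actual schedule is only visible in the screenshots.

```python
from datetime import datetime, timedelta

# Assumed standard five-field cron equivalent of the schedule: 0 12 * * MON-FRI


def next_weekday_noon(after: datetime) -> datetime:
    """Return the first Monday-Friday 12:00 strictly after `after`."""
    candidate = after.replace(hour=12, minute=0, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while candidate.weekday() >= 5:
        candidate += timedelta(days=1)
    return candidate


def runs_between(start: datetime, end: datetime) -> list[datetime]:
    """List every expected run time in the interval (start, end]."""
    runs = []
    current = next_weekday_noon(start)
    while current <= end:
        runs.append(current)
        current = next_weekday_noon(current)
    return runs
```

Under this schedule, each weekday yields exactly one run, and a run on a Friday is followed by the next run on Monday.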

Context

  • Operating System and version:
    Browser = Chrome
  • Jupyter Server version: 2.10.0
  • Jupyter Lab version: 4.0.8
  • Jupyter Notebook version: 7.0.6
Troubleshoot Output
Paste the output from running `jupyter troubleshoot` from the command line here.
You may want to sanitize the paths in the output.

(I tried to paste my entire result but I got an error that body is too long). I have manually trimmed some info out of this result.

(base) C:>jupyter troubleshoot
$PATH:

sys.path:
C:\Users\XXXXX\AppData\Local\anaconda3\Scripts
C:\Users\XXXXX\AppData\Local\anaconda3\python311.zip
C:\Users\XXXXX\AppData\Local\anaconda3\DLLs
C:\Users\XXXXX\AppData\Local\anaconda3\Lib
C:\Users\XXXXX\AppData\Local\anaconda3
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\win32
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\win32\lib
C:\Users\XXXXX\AppData\Roaming\Python\Python311\site-packages\Pythonwin
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\win32
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\win32\lib
C:\Users\XXXXX\AppData\Local\anaconda3\Lib\site-packages\Pythonwin

sys.executable:
C:\Users\XXXXX\AppData\Local\anaconda3\python.exe

sys.version:
3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)]

platform.platform():
Windows-10-10.0.19045-SP0

where jupyter:
C:\Users\XXXXX\AppData\Local\anaconda3\Scripts\jupyter.exe
Jupyter Troubleshoot Results.docx

TallGibbs added the bug label Jan 25, 2024

welcome bot commented Jan 25, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@TallGibbs
Author

This job ran from its daily schedule and now I received 4 emails and it appears the job executed 4 times today (1 more time than yesterday). Is this going to continue to increase each day by 1?

Here you can see the notebook jobs completed:
image

Here is the current scheduled job:
image

@andrii-i
Collaborator

Hi @TallGibbs. Thank you for opening this issue and proactively providing the `jupyter troubleshoot` output; this is very helpful.

Could you please send a screenshot of the "Run on a schedule" section of the Create Job screen for the job this is happening with? For the sake of reproduction, I'd like to understand whether you defined the schedule via the Interval > Day option or the Interval > Custom schedule > cron expression option.

Screenshot 2024-01-25 at 11 18 26 AM

@TallGibbs
Author

Here you go!

Creating the schedule:
image

Final output of the scheduled job from the "Job Definition" page:
image

@andrii-i
Collaborator

Thank you for the details @TallGibbs. I was not able to reproduce this through clock manipulation, so I am now trying to reproduce it "organically" by waiting for 12 PM.

@TallGibbs
Author

@andrii-i Today the job completed two times. However, yesterday afternoon I re-installed jupyter notebook version 6.5.4 through Anaconda Navigator GUI, then updated (back) to 7.0.6 and then deleted all my jobs in Jupyter lab. Then I scheduled this job again and now it resulted in the job running twice:

image

Here are my scheduled jobs:
image

@TallGibbs
Author

I'm not going to touch anything today on the job scheduler and see what happens on Saturday, Sunday and Monday. It should NOT run on the weekend and then should run again on Monday.

@JasonWeill
Collaborator

I set up a daily job run on my Windows 10 machine that has run exactly once per day, every day, since I first configured it.

If you run `jupyter server list` on a command prompt in your same Conda environment, how many entries do you see under "Currently running servers"? If you have multiple Jupyter Server instances running, that might be related to this.
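
A minimal sketch for counting running servers from Python rather than reading the terminal output by eye. It assumes the `jupyter` executable is on the PATH of the active environment and relies on the `--json` flag of `jupyter server list`, which prints one JSON object per server. `parse_server_list` and `running_servers` are hypothetical helper names, and the keys present in each object may vary by version.

```python
import json
import subprocess


def parse_server_list(json_lines: str) -> list[dict]:
    """Parse the one-JSON-object-per-line output of `jupyter server list --json`."""
    servers = []
    for line in json_lines.splitlines():
        line = line.strip()
        if line.startswith("{"):  # skip blank lines and any non-JSON text
            servers.append(json.loads(line))
    return servers


def running_servers() -> list[dict]:
    """Ask the Jupyter CLI for running servers (requires Jupyter Server installed)."""
    result = subprocess.run(
        ["jupyter", "server", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_server_list(result.stdout)
```

Calling `len(running_servers())` in the same environment that hosts the scheduler gives the number to compare against the number of duplicate job runs.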

@TallGibbs
Author

Here is the update for today: on Saturday and Sunday the job did NOT run (which is a good thing; only scheduled Mon through Fri). Today though (Monday), the job ran 3 times which is once more than what happened on Friday.

Somehow the scheduler is running the job once more each time? Is there any way that my code could be causing this? I'm assuming not, because Jupyter Lab's "Notebook jobs" tab shows that the whole job ran each time (not that my code just looped through it).

Here is the job's result from today:
image

@TallGibbs
Author

I'm going to leave everything the same for tomorrow and see if it runs 4 times. Then, I'm going to make a new schedule that is NOT at 12:00 PM and see if that is related to the issue. @andrii-i

@JasonWeill
Collaborator

Thanks! For what it's worth, my schedule was at 8:00 am, and the run time typically started within 10 seconds of the hour.

@TallGibbs
Author

> I set up a daily job run on my Windows 10 machine that has run exactly once per day, every day, since I first configured it.
>
> If you run `jupyter server list` on a command prompt in your same Conda environment, how many entries do you see under "Currently running servers"? If you have multiple Jupyter Server instances running, that might be related to this.

I just ran that command in my activated environment and I see 5 servers running (I'm not sure whether it is safe to post a screenshot of the full server addresses). It may be confusing, though, because I also currently have Jupyter Notebook open with some code I am working on, while also running Jupyter Lab.

I'm guessing a better test is to close everything down, then open just an Anaconda Prompt and run that command (after activating my environment)?

@JasonWeill
Collaborator

If you have multiple Jupyter Servers open with the Jupyter Scheduler server extension running, that might be related to the behavior you're seeing. If you keep only one such server running, does that fix the problem?

(This may be an enhancement opportunity to run jobs only once, even if multiple servers are running.)
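
One way such an enhancement could work, sketched here as an assumption and not as anything Jupyter Scheduler implements: before creating a job, each server tries to record a claim for the pair (job definition, scheduled time) in a shared database, and only the server whose insert succeeds runs the job. The table name, column names, and function names below are all hypothetical.

```python
import sqlite3


def open_claims(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a claims table keyed by definition and scheduled time."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS run_claims ("
        " definition_id TEXT NOT NULL,"
        " scheduled_for TEXT NOT NULL,"
        " claimed_by TEXT NOT NULL,"
        " PRIMARY KEY (definition_id, scheduled_for))"
    )
    return conn


def try_claim(
    conn: sqlite3.Connection, definition_id: str, scheduled_for: str, server_id: str
) -> bool:
    """Return True only for the first server to claim this scheduled run."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO run_claims VALUES (?, ?, ?)",
        (definition_id, scheduled_for, server_id),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the primary key already existed
    return cursor.rowcount == 1
```

In this sketch a single in-memory connection stands in for several servers; in practice each server would open its own connection to the same database file, and the primary key constraint is what prevents a second claim for the same run.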

@TallGibbs
Author

> If you have multiple Jupyter Servers open with the Jupyter Scheduler server extension running, that might be related to the behavior you're seeing. If you keep only one such server running, does that fix the problem?
>
> (This may be an enhancement opportunity to run jobs only once, even if multiple servers are running.)

What do you think is most valuable for testing/documentation purposes? I see 3 options I could do right now.

Option 1 = Do nothing, and see if script runs 1 additional time tomorrow at noon.
Option 2 = Close all jupyter servers, then proceed with essentially option 1 (let scheduled job run at Noon).
Option 3 = Close all jupyter servers, make a new scheduled job that runs today at 3:00 PM and see what happens

@andrii-i
Collaborator

andrii-i commented Jan 29, 2024

@TallGibbs please try option 2. Please make sure that only 1 instance of the Jupyter server with the jupyter_scheduler extension installed is running, and see if the duplication goes away. I would expect this to solve the problem. I tried the same schedule on Windows with 1 instance of the Jupyter server running, and every job is created only once:
no_duplicates

@TallGibbs
Author

@andrii-i Ok, I physically closed all open tabs with anything Jupyter related and then opened an Anaconda Prompt and ran the command: jupyter server list

Here are the results from that:
image

It appears there are 3 servers still running even though I do not physically have any open browsers related to Jupyter. So, to try and shut them down I ran this command: jupyter notebook stop 8888

image

I get the response "Could not stop server on 8888" within my Anaconda Prompt. I get the same response for all 3 servers when I try the different ports. It makes me think there are some conflicts between my currently installed Anaconda and previous versions? I had to uninstall and reinstall several times in the past few weeks due to some package conflicts.

@TallGibbs
Author

@andrii-i sorry for the additional post: it looks like I had to completely close Anaconda Navigator and now I show 0 servers running.

image

@TallGibbs
Author

Today's update: with all the servers shut down, the scheduled job did not run at all. Is that the expected response? Is there a way to execute a scheduled job without a server running?

Now that we verified no job ran, with no servers running, I'm going to schedule the file to run this afternoon and see what happens.

@TallGibbs
Author

Looks good today, job ran successfully and only ran 1 time. Now, I will leave it untouched through tomorrow's scheduled job and verify it still only runs once.

image

@andrii-i
Collaborator

andrii-i commented Jan 30, 2024

@TallGibbs thank you for testing and confirming that there is no duplication when a single server is running.

> Today's update: with all the servers shutdown the scheduled job did not run at all. Is that the expected response? Is there a way to execute a scheduled job without a server running?

It is the expected response. Jupyter Scheduler is composed of the jupyter_scheduler server extension and the @jupyterlab/scheduler lab extension. The jupyter_scheduler server extension owns and manages job execution. So with no instances of the server running, no jobs are created; with multiple instances of the server running, multiple jobs are created.
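
The relationship described above can be illustrated with a toy model. It is not Jupyter Scheduler's code; it simply assumes that every running server instance reads the same job definitions and independently creates one job per definition at the scheduled time. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Job definitions and created jobs, visible to every server instance."""
    definitions: list[str]
    jobs: list[tuple[str, str]] = field(default_factory=list)


@dataclass
class SchedulerInstance:
    """Stands in for one server process with the scheduler extension loaded."""
    name: str
    store: SharedStore

    def tick(self) -> None:
        # At the scheduled time, each instance creates a job for every definition,
        # unaware that other instances are doing the same.
        for definition in self.store.definitions:
            self.store.jobs.append((definition, self.name))


def jobs_created(server_count: int) -> int:
    """Count jobs created for one definition when `server_count` servers are up."""
    store = SharedStore(definitions=["daily-report"])
    for i in range(server_count):
        SchedulerInstance(f"server-{i}", store).tick()
    return len(store.jobs)
```

In this model, `jobs_created(0)` is 0, `jobs_created(1)` is 1, and `jobs_created(3)` is 3, which matches the pattern reported in this thread: no runs with all servers shut down, one run with a single server, and several runs when stale servers were left running.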

> Looks good today, job ran successfully and only ran 1 time. Now, I will leave it untouched through tomorrow's scheduled job and verify it still only runs once.

Sounds good. Please let us know if the problem goes away.

@dlqqq
Collaborator

dlqqq commented Jan 30, 2024

It seems like your issue has been resolved, so I'll close this for now. Feel free to continue discussion if needed however! 👋

dlqqq closed this as completed Jan 30, 2024

@TallGibbs
Author

Thank you everyone for all the help.

I'm not sure this is the right place for this question, but I thought I would ask anyway, even if just for a redirect: is there a way for me to have a Jupyter server session running without Anaconda Navigator open? We are trying to find a replacement for Alteryx Server running scheduled jobs (at my company), and I'm hoping I can find a way with Jupyter where a scheduled job can run even when my work computer is shut down. I'm not an expert in computer networks/servers (clearly), so I am kind of figuring this out as I go...

@JasonWeill
Collaborator

I run JupyterLab by running `jupyter lab` from an Anaconda-enabled command prompt, and I haven't run Anaconda Navigator at all. You'll need to have a server running all the time, ideally something running in a data center or in a hosted environment (disclosure: I work for AWS), that stays up so that jobs get rerun.

@TallGibbs
Author

@JasonWeill awesome info, thank you that is helpful and gives me some ideas.

@TallGibbs
Author

Another follow-up closely related to this issue: is it normal that the command "jupyter server stop 8888" gives back the response "Could not stop server on 8888", but then when you run "jupyter server list", the server on port 8888 no longer shows as running? It seems like a flaw in the code when it tells me "could not stop server on 8888."

Here is my screenshot of this exact sequence:
image

@JasonWeill
Collaborator

Jupyter Server has a separate issue queue: https://github.com/jupyter-server/jupyter_server/issues

@andrii-i
Collaborator

andrii-i commented Feb 1, 2024

@TallGibbs below are some good places to ask questions if you don't want to create an issue:
