Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add automatic periodic check of URLs in jobs #483

Closed
wants to merge 2 commits into from

Conversation

brain-geek
Copy link
Contributor

No description provided.

lib/companies/jobs.ex Outdated Show resolved Hide resolved
Copy link
Collaborator

@notactuallypagemcconnell notactuallypagemcconnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think quantum is overkill. We don't need to add a dependency for a recurring task this simple.

We have a blog post that goes over the basic idea, it http://elixirschool.com/blog/til-send-after/ it references this bit from the GenServer docs

    defmodule MyApp.Periodically do
      use GenServer

      def start_link(_) do
        GenServer.start_link(__MODULE__, %{})
      end

      @impl true
      def init(state) do
        # Schedule work to be performed on start
        schedule_work()

        {:ok, state}
      end

      @impl true
      def handle_info(:work, state) do
        # Do the desired work here
        # so check urls whathaveyou with the timeout

        # Reschedule once more
        schedule_work()

        {:noreply, state}
      end

      defp schedule_work do
        # In 2 hours
        Process.send_after(self(), :work, 2 * 60 * 60 * 1000)
      end
    end

Thoughts?

@brain-geek
Copy link
Contributor Author

brain-geek commented Mar 7, 2019

@notactuallypagemcconnell yeah, sure, the only problem I don't like in send_after approach is that I have no idea how to test the recurring part.

Any ideas on that?

@notactuallypagemcconnell
Copy link
Collaborator

@brain-geek yes I will push up a commit with an example sometime today

@notactuallypagemcconnell
Copy link
Collaborator

@brain-geek heres an example, I just will paste here since this branch is on your fork and I'm lazy.

# lib/companies/periodic.ex
defmodule Companies.Periodic do
  use GenServer
  require Logger

  def start_link({_mod, _func, _argument_list, _interval} = state) do
    GenServer.start_link(__MODULE__, state)
  end

  @impl true
  def init({_mod, _func, _argument_list, interval} = state) do
    schedule_work(interval)

    {:ok, state}
  end

  @impl true
  def handle_info(:work, {mod, func, argument_list, interval} = state) do
    apply(mod, func, argument_list)
    schedule_work(interval)

    {:noreply, state}
  end

  defp schedule_work(interval) do
    Process.send_after(self(), :work, interval)
  end
end

Code to run off {mod, func, args, interval} where func is applied to mod using args in interval

# test/companies/periodic_test.exs
defmodule Companies.PeriodicTest do
  use ExUnit.Case

  import ExUnit.CaptureIO

  require Logger

  test "it can run a given module's given function with the given arguments in the given interval" do
    assert capture_io(fn ->
             {:ok, _pid} = Companies.Periodic.start_link({IO, :puts, ["hi"], 1})
             :timer.sleep(5)
           end) =~ "hi\nhi"
  end
end

This test proves that it runs the task twice and has minimal overhead (3 milliseconds spent sleeping) but proves the premise is sound and tests it at an OTP level. I think it works well enough. We use some code similar to this (but with some more sophistication added) at work.

@brain-geek brain-geek force-pushed the check_urls branch 2 times, most recently from 79372ab to 7240f37 Compare March 7, 2019 21:54
@brain-geek
Copy link
Contributor Author

@notactuallypagemcconnell Thanks for the snippet, that's a clever way to test it! Applied.

Copy link
Member

@doomspork doomspork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very exciting @brain-geek!

lib/companies/job_checker.ex Outdated Show resolved Hide resolved
lib/companies/job_checker.ex Outdated Show resolved Hide resolved
lib/companies/jobs.ex Outdated Show resolved Hide resolved
@brain-geek
Copy link
Contributor Author

@doomspork maybe, but then we need to mark this automatic user somehow.

lib/companies/scheduler.ex Outdated Show resolved Hide resolved
config/config.exs Outdated Show resolved Hide resolved
@lcezermf
Copy link
Contributor

Recently I found a blog post that may help with the implementation:

@gemantzu @brain-geek

@brain-geek brain-geek force-pushed the check_urls branch 2 times, most recently from 865117b to d696410 Compare December 20, 2019 21:17
@brain-geek
Copy link
Contributor Author

@lccezinha Thanks for the suggestion, reworked it according to your suggestion.

@brain-geek brain-geek marked this pull request as ready for review December 21, 2019 11:47
@brain-geek brain-geek requested a review from a team as a code owner December 21, 2019 11:47
@brain-geek
Copy link
Contributor Author

@doomspork Could you check out the new version? It's now creates pending changes request instead of deleting something, so it should be more reliable. 😄

@brain-geek brain-geek force-pushed the check_urls branch 2 times, most recently from f4b804d to 64436d6 Compare December 21, 2019 12:03
def check_job(%{url: url} = job) do
case HTTPoison.head(url) do
{:ok, %{status_code: 404}} ->
user_id = Application.get_env(:companies, :jobs_url_checker)[:user_id]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Application.get_env(:companies, :jobs_url_checker)[:user_id] resolves to System.get_env("JOB_CHECKER_USER_ID") in a production environment, and nil in other environments. Would it be simpler to read from the environment variable here and omit setting user_id in the configuration file?

user_id = System.get_env("JOB_CHECKER_USER_ID")

That has a slight potential benefit of allowing you to change the value without restarting the app, although I'm not sure if that will make a practical difference. It also simplifies the application logic a little bit. A minor downside is that a change like this splits the configuration up between the configuration file and the application code. It may also be detrimental to use the JOB_CHECKER_USER_ID environment variable in non-production environments if that value could be non-empty, and you wanted to ensure user_id was nil in those environments.

Just some food for thought 😄

where: fragment("changes->>'id' = ?", ^id)

Repo.all(query)
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use case for this function is to determine if any pending changes exist. The place that calls this function isn't really interested in the actual list of changes. What would you think about converting this to a predicate function?

def any?(record) do
  # ...

  Repo.exists?(query)
end

Companies.JobChecker.check_job/1 could then simplify its conditional like this:

if PendingChanges.any?(record) do 
  Jobs.delete(job, %{id: user_id})
end

@@ -71,6 +71,19 @@ defmodule Companies.PendingChanges do
end
end

def get_pending_changes_for(record) do
resource_name = struct_to_string(record)
id = record.id |> to_string()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's only one pipe here, what would you think about using the conventional function syntax?

id = to_string(record.id)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the preferred way given the styleguide we try to follow. We may need to update our Credo (or add it) rules for some of these.

Additionally I'd likely capture the record_id value via a match and avoid dot notation (I don't know why, I don't much care for it personally).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants