
Keeping pipelines full #10225

Merged: 19 commits into ray-project:master on Aug 31, 2020

Conversation

@goliaro (Contributor) commented Aug 20, 2020

Why are these changes needed?

These changes are needed to avoid over-requesting new workers when pipelining is used to submit tasks from owners to their workers. When a new task is submitted to an owner, the code first tries to send the task to an existing worker whose number of in-flight tasks is below the maximum allowed by the pipelining settings. Only if the pipelines to all existing workers are full does it request a new worker.
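
As a rough illustration of the decision rule described above (an editorial sketch, not code from this PR; the parameter names mirror identifiers discussed later in the thread):

#include <cstddef>
#include <cstdint>

// Toy model: decide whether a newly submitted task needs a new worker lease,
// or whether it can ride an existing worker's pipeline.
bool NeedNewWorker(uint32_t total_tasks_in_flight, size_t num_active_workers,
                   uint32_t max_tasks_in_flight_per_worker) {
  // Pipelines still have room as long as the total in-flight count is below
  // the combined capacity of all currently leased workers.
  return total_tasks_in_flight >=
         num_active_workers * max_tasks_in_flight_per_worker;
}

For example, with max_tasks_in_flight_per_worker = 10 and two leased workers, a new worker is requested only once 20 tasks are already in flight.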

Related issue number

Checks

  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [x] I've included any doc changes needed for https://docs.ray.io/en/latest/.
  • [x] I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested (please justify below)

@stephanie-wang stephanie-wang self-assigned this Aug 20, 2020
// (1) how many worker leases have been granted to execute tasks with
// the current SchedulingKey
// (2) how many tasks are in flight to all the workers from (1)
absl::flat_hash_map<SchedulingKey, std::pair<uint32_t, uint32_t>> worker_info_
Contributor:

Discussed offline about refactoring this class to squash all of the fields that are currently related to SchedulingKey into one hashmap. The value should be a new struct (a rough sketch follows this list) that includes:

  • these new counts
  • the workers that are currently leased to us with that scheduling key (worker_to_lease_entry_)
  • whether there is a pending lease request for that key (pending_lease_requests_)
  • the tasks queued with that key (task_queues_)
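
A minimal sketch of what such a struct could look like (field names follow the snippets quoted later in this thread; the exact types are assumptions, not the final code):

// Sketch only: one entry per SchedulingKey, grouping state that was previously
// spread across several per-key hashmaps.
struct SchedulingKeyEntry {
  // Pending worker lease request to the raylet for this key, if any.
  std::pair<std::shared_ptr<WorkerLeaseInterface>, TaskID> pending_lease_request =
      std::make_pair(nullptr, TaskID::Nil());
  // Tasks queued for this key that have not yet been sent to a worker.
  std::deque<TaskSpecification> task_queue;
  // Workers currently leased to us for this key.
  absl::flat_hash_set<rpc::WorkerAddress> active_workers;
  // Total number of tasks in flight to all workers in active_workers.
  uint32_t total_tasks_in_flight = 0;
};

// The submitter would then keep a single map keyed by SchedulingKey:
absl::flat_hash_map<SchedulingKey, SchedulingKeyEntry> scheduling_key_entries_
    GUARDED_BY(mu_);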

@stephanie-wang (Contributor):

I'll take another look once the refactoring and unit tests are done!

@stephanie-wang stephanie-wang added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Aug 20, 2020
it->second.push_back(task_spec);
auto &it = scheduling_key_entries_[scheduling_key];
it.task_queue_.push_back(task_spec);
RAY_CHECK(it.task_queue_.size() >= 1);
Contributor:

This check doesn't seem necessary.

// We don't have any of this type of task to run.
return;
}

// If pipelining is enabled, check whether we really need a new worker or whether we have
// enough room in an existing worker's pipeline to send the new tasks
if (max_tasks_in_flight_per_worker_ > 1) {
Contributor:

I don't think we need this check because it should be okay to run the following code even when max_tasks_in_flight = 1.

// enough room in an existing worker's pipeline to send the new tasks
if (max_tasks_in_flight_per_worker_ > 1) {
if (scheduling_key_entry.tot_tasks_in_flight <
scheduling_key_entry.active_workers_.size() * max_tasks_in_flight_per_worker_) {
Contributor:

I'm not sure about putting this logic in RequestNewWorkerIfNeeded. It seems a bit strange to be submitting tasks in this method (before this change, the logic only had to do with requesting new workers, not with task submission). It also seems a bit brittle because OnWorkerIdle, called below, calls back into RequestNewWorkerIfNeeded.

Instead, how about we only use the check to see whether we should request a new worker or not? Then, we should also move this new logic to submit tasks to already active workers to when the task is first queued in SubmitTask.

Contributor Author (@goliaro), Aug 25, 2020:

The reason why I added this part (calling OnWorkerIdle from within RequestNewWorkerIfNeeded) was to avoid introducing new latency due to the fact that OnWorkerIdle works in a pull (rather than push) fashion. OnWorkerIdle is normally only called when (1) we get a new worker, or (2) when we get a response from a worker (after the worker has completed the execution of a task). At that point, we pull tasks from the queue and submit them.

Now, consider a scenario with a set of active workers whose pipelines are not full, where we have just added a few more tasks to the owner's queue. Even though some pipelines have room, before we can submit the new tasks to an active worker we have to wait until that worker has responded to the owner (because OnWorkerIdle is not called until then). If we enqueue more tasks at the owner than the total number of available slots in the existing workers' pipelines, the system will fill those pipelines (and realize that it needs to request an additional worker) only after it has received a response from every one of the existing workers with non-full pipelines. As a result, the total number of active workers stays lower for a longer period of time, and the overall execution time suffers.

Contributor:

Yes, I understand that we should call OnWorkerIdle even while it is still executing other tasks. My comment was more about where we should call it. I think it is better to call it directly in SubmitTask right after we've added tasks to the queue. The reason is that: a) dispatching tasks doesn't match the current semantics of the method, which is supposed to only request a new worker, and b) it prevents the bad recursive structure where RequestNewWorkerIfNeeded calls OnWorkerIdle calls RequestNewWorkerIfNeeded, etc.

Contributor Author:

Oh I see! I think I misread your initial comment! Sorry about that. This makes sense. So I guess we would just move the code block

if (scheduling_key_entry.tot_tasks_in_flight <
      scheduling_key_entry.active_workers_.size() * max_tasks_in_flight_per_worker_) {
    // The pipelines to the current workers are not full yet, so we don't need more
    // workers.

    // Find a worker with a number of tasks in flight that is less than the maximum
    // value (max_tasks_in_flight_per_worker_) and call OnWorkerIdle to send tasks to
    // that worker
    for (auto active_worker_addr : scheduling_key_entry.active_workers_) {
      RAY_CHECK(worker_to_lease_entry_.find(active_worker_addr) !=
                worker_to_lease_entry_.end());
      auto &lease_entry = worker_to_lease_entry_[active_worker_addr];
      if (lease_entry.tasks_in_flight_ < max_tasks_in_flight_per_worker_) {
        OnWorkerIdle(active_worker_addr, scheduling_key, false,
                     lease_entry.assigned_resources_);
        break;
      }
    }

    return;
  } 

(without the return statement, of course) to the SubmitTask function. RequestNewWorkerIfNeeded would then only keep the following if statement?

if (scheduling_key_entry.tot_tasks_in_flight <
      scheduling_key_entry.active_workers_.size() * max_tasks_in_flight_per_worker_) {
    // The pipelines to the current workers are not full yet, so we don't need more
    // workers.

    return;
  } 

Contributor:

Yes, exactly!

Contributor Author:

Sounds good! Let me do that right now.

if (lease_entry.tasks_in_flight_ < max_tasks_in_flight_per_worker_) {
OnWorkerIdle(active_worker_addr, scheduling_key, false,
lease_entry.assigned_resources_);
break;
Contributor:

I'm not sure if we want to break here. Couldn't there be multiple idle workers that could get filled by new tasks? But I guess this depends on where we decide to put this logic (see above comment).

Contributor Author:

I see! So I guess we would remove the break statement if we put this in SubmitTask, right?

Contributor:

Hmm I need to think about that. It seems like we could structure the code so that we're always guaranteed that only one worker needs to be filled up during SubmitTask, but I'm not sure.

auto lease_client = std::move(pending_lease_request.first);
const auto task_id = pending_lease_request.second;
pending_lease_request = std::make_pair(nullptr, TaskID::Nil());
RAY_CHECK(lease_client);
Contributor:

I don't think this check is necessary (std::move should guarantee this already).

.emplace(scheduling_key, std::make_pair(lease_client, task_id))
.second);
pending_lease_request = std::make_pair(lease_client, task_id);
RAY_CHECK(pending_lease_request.first);
Contributor:

I don't think this check is necessary. The previous check was just to make sure that there wasn't already a pending lease request for the same scheduling key (arguably also not necessary).

@@ -399,6 +469,22 @@ Status CoreWorkerDirectTaskSubmitter::CancelTask(TaskSpecification task_spec,
cancel_retry_timer_->async_wait(boost::bind(
&CoreWorkerDirectTaskSubmitter::CancelTask, this, task_spec, force_kill));
}
} else if (status.ok() && reply.attempt_succeeded()) {
Contributor:

I don't think this is necessary because we should still get back the callback for PushNormalTask.

std::deque<TaskSpecification> task_queue_ = std::deque<TaskSpecification>();
// Keep track of the active workers, so that we can quickly check if one of them has
// room for more tasks in flight
absl::flat_hash_set<rpc::WorkerAddress> active_workers_ =
Contributor:

Consider making this a hashmap from rpc::WorkerAddress -> LeaseEntry instead of keeping a separate hashmap.

Contributor Author:

Doesn't worker_to_lease_entry_ already have this mapping? Do you mean that I should just move worker_to_lease_entry_ into the SchedulingKeyEntry struct, so that each SchedulingKey will be paired to its own worker_to_lease_entry_ hashmap?

@stephanie-wang (Contributor) commented Aug 25, 2020 via email

@stephanie-wang (Contributor) commented Aug 25, 2020 via email

@goliaro (Contributor Author) commented Aug 25, 2020

@stephanie-wang Ok, I just pushed the updated code :)

Co-authored-by: fangfengbin <869218239a@zju.edu.cn>
@stephanie-wang (Contributor) left a comment:

A few more comments :)

task_queues_.emplace(scheduling_key, std::deque<TaskSpecification>()).first;
auto &scheduling_key_entry = scheduling_key_entries_[scheduling_key];
scheduling_key_entry.task_queue_.push_back(task_spec);
if (scheduling_key_entry.tot_tasks_in_flight <
Contributor:

Maybe we should make this a while loop? And yeah, I think it's not correct to break after only one worker. Could you add this case to the unit test (i.e. check that we don't request a new worker if there are multiple workers that could be filled from the owner's queue). Thanks!

Contributor:

Hmm okay now that I'm thinking about this again, I think it is okay to have the break statement and only dispatch to one worker! Sorry about that :)

Could you add a comment explaining why it's okay to break after one worker?

Contributor Author:

Sounds good, let me add that right now!

task_queues_.erase(queue_entry);
RAY_LOG(DEBUG) << "Task queue empty, canceling lease request";
if (current_queue.empty()) {
RAY_LOG(INFO) << "Task queue empty, canceling lease request";
Contributor:

Shouldn't we still attempt to delete the entry here? It'd be good to add a method to the new SchedulingKeyEntry struct to check whether it's safe to delete the entry (i.e. if the task queue is empty, no pending request, etc).

Contributor Author (@goliaro), Aug 25, 2020:

I just added the new method! However, I'm not sure about attempting the deletion at this point in the code, because there would still be some worker in the active_workers set, so the new method would tell us that it's not yet safe to delete the entry.

Contributor:

Ahh gotcha, makes sense!

// The pipelines to the current workers are not full yet, so we don't need more
// workers.

// Find a worker with a number of tasks in flight that is less than the maximum
Contributor:

Could you remove this comment?

src/ray/core_worker/transport/direct_task_transport.cc (outdated, resolved)
@@ -377,6 +449,7 @@ Status CoreWorkerDirectTaskSubmitter::CancelTask(TaskSpecification task_spec,
return Status::OK();
}
client = maybe_client.value();
client_addr = rpc_client->second.ToProto();
Contributor:

I don't think we need this variable anymore.

absl::flat_hash_set<rpc::WorkerAddress> active_workers_ =
absl::flat_hash_set<rpc::WorkerAddress>();
// Keep track of how many tasks with this SchedulingKey are in flight, in total
uint32_t tot_tasks_in_flight = 0;
Contributor:

Suggested change:
- uint32_t tot_tasks_in_flight = 0;
+ uint32_t total_tasks_in_flight = 0;

We try to use complete words for variable naming in most cases.

pending_lease_requests_ GUARDED_BY(mu_);
struct SchedulingKeyEntry {
// Keep track of pending worker lease requests to the raylet.
std::pair<std::shared_ptr<WorkerLeaseInterface>, TaskID> pending_lease_request_ =
Contributor:

Could you remove the trailing underscore from these field names? We usually use it only for private class members (see the Google style guide). Realizing now that the LeaseEntry struct above doesn't follow this convention either, oops :)

Contributor Author (@goliaro), Aug 25, 2020:

Should I also remove the underscores from the field names in the LeaseEntry struct? I guess it's better late than never, right?

Contributor:

That would be great, thanks! :)

@goliaro (Contributor Author) commented Aug 25, 2020

Ok, I just pushed the updated code!

@stephanie-wang (Contributor) left a comment:

Thanks, this is looking very close! I just had some questions about making sure we delete the SchedulingKeyEntry properly. It would also be good to add unit tests for that, to make sure we're not leaking any memory.

scheduling_key_entry.task_queue.push_back(task_spec);
if (scheduling_key_entry.total_tasks_in_flight <
scheduling_key_entry.active_workers.size() *
max_tasks_in_flight_per_worker_) {
Contributor:

Suggest wrapping this in a const method of the SchedulingKeyEntry (e.g., HasAvailableWorkers()) to make it more readable!

Contributor Author (@goliaro), Aug 26, 2020:

I was wondering, what do you mean by a const method? Also, I had to define the method outside of the SchedulingKeyEntry struct to be able to access the max_tasks_in_flight_per_worker_ member of CoreWorkerDirectTaskSubmitter. Otherwise, the compiler reported an error that I did not know how to fix. Do you know if there is a way to access max_tasks_in_flight_per_worker_ from a method within the SchedulingKeyEntry struct?

Contributor:

Ah I just mean to mark that the method does not modify the SchedulingKeyEntry, like this:
CanDelete() const { ... }

Yes, it shouldn't be able to access it because it's a private member of CoreWorkerDirectTaskSubmitter. You can fix it by allowing CanDelete to take the max tasks as an argument.

Contributor Author (@goliaro), Aug 27, 2020:

Sounds good! I'll add the const keyword and allow CanDelete() (and the other two methods I added to check pipeline fullness) to take max_tasks_in_flight_per_worker_ as an argument, so that I can place the functions inside the struct.
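
For illustration, the agreed-upon shape might look roughly like this (a sketch, shown out of line; the fullness-check name and the exact conditions inside CanDelete are assumptions, not the final code):

// The pipelining limit is passed in because max_tasks_in_flight_per_worker_
// is a private member of CoreWorkerDirectTaskSubmitter.
bool SchedulingKeyEntry::AllPipelinesToWorkersFull(
    uint32_t max_tasks_in_flight_per_worker) const {
  return total_tasks_in_flight >=
         active_workers.size() * max_tasks_in_flight_per_worker;
}

// Safe to erase the entry only once nothing references the key anymore
// (no queued tasks, no pending lease request, no leased workers).
bool SchedulingKeyEntry::CanDelete() const {
  return task_queue.empty() && pending_lease_request.first == nullptr &&
         active_workers.empty() && total_tasks_in_flight == 0;
}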

RAY_CHECK(worker_to_lease_entry_.find(active_worker_addr) !=
worker_to_lease_entry_.end());
auto &lease_entry = worker_to_lease_entry_[active_worker_addr];
if (lease_entry.tasks_in_flight < max_tasks_in_flight_per_worker_) {
Contributor:

Suggest wrapping this in a const method of the WorkerLeaseEntry to make it more readable!
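
For example, a small const helper on the lease entry could wrap the comparison (a sketch; the method name is illustrative, not the one used in the PR):

// Returns true once this worker's pipeline has reached the per-worker limit.
bool LeaseEntry::PipelineToWorkerFull(
    uint32_t max_tasks_in_flight_per_worker) const {
  return tasks_in_flight >= max_tasks_in_flight_per_worker;
}

The call site above would then read if (!lease_entry.PipelineToWorkerFull(max_tasks_in_flight_per_worker_)) instead of comparing the counters inline.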

if (it == task_queues_.end()) {

auto &task_queue = scheduling_key_entry.task_queue;
if (task_queue.empty()) {
// We don't have any of this type of task to run.
Contributor:

Shouldn't we check if it's okay to delete the SchedulingKeyEntry here too?

}
if (reply.worker_exiting()) {
// The worker is draining and will shutdown after it is done. Don't return
// it to the Raylet since that will kill it early.
absl::MutexLock lock(&mu_);
worker_to_lease_entry_.erase(addr);
auto &scheduling_key_entry = scheduling_key_entries_[scheduling_key];
scheduling_key_entry.active_workers.erase(addr);
Contributor:

Do we need to check if we should delete the SchedulingKeyEntry here?


if (scheduled_tasks->second.empty()) {
task_queues_.erase(scheduling_key);
if (scheduled_tasks.empty()) {
CancelWorkerLeaseIfNeeded(scheduling_key);
Contributor:

Do we need to check whether we should delete the SchedulingKeyEntry here?

Contributor Author:

Yes! However, I think that the check should be placed inside the callback function in CancelWorkerLeaseIfNeeded, because the callback can call CancelWorkerLeaseIfNeeded as well, and CancelWorkerLeaseIfNeeded needs to access the SchedulingKeyEntry.

src/ray/core_worker/transport/direct_task_transport.h (outdated, resolved)
Gabriele Oliaro and others added 2 commits August 26, 2020 15:58
@goliaro (Contributor Author) commented Aug 26, 2020

Thanks, this is looking very close! I just had some questions about making sure we delete the SchedulingKeyEntry properly. It would also be good to add unit tests for that, to make sure we're not leaking any memory.

I was wondering what type of unit test you had in mind to check that we are not leaking memory by forgetting to delete some entries in the scheduling_key_entries hashmap. Because the hashmap is a private field of the CoreWorkerDirectTaskSubmitter class, we can't just check its size from within direct_task_transport_test.cc. Should I add a public function that allows us to check the size?

@stephanie-wang (Contributor):

Yes, you can either add a public function to check the size or you can make the unit test class a friend of the CoreWorkerDirectTaskSubmitter class (see here).
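
The friend-class route might look roughly like this (a sketch; the test fixture name is hypothetical):

// Inside the CoreWorkerDirectTaskSubmitter declaration, grant the unit test
// access to private members such as scheduling_key_entries_.
class CoreWorkerDirectTaskSubmitter {
 private:
  friend class CoreWorkerDirectTaskSubmitterTest;
};

The test could then assert, for example, that scheduling_key_entries_ is empty once all submitted tasks have completed, which would catch leaked entries.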

if (scheduling_key_entries_.size() != 0) {
RAY_LOG(INFO) << "size: " << scheduling_key_entries_.size();
}
return scheduling_key_entries_.size() == 0;
Contributor:

You can also use scheduling_key_entries_.empty() here!

Contributor Author:

Good call! Just changed this. I also removed the RAY_LOG(INFO) line

@stephanie-wang (Contributor) left a comment:

Looks great! I'll merge once Travis finishes.

@stephanie-wang stephanie-wang removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Aug 28, 2020
@stephanie-wang (Contributor):

Hey @gabrieleoliaro, looks like there is a build error. Could you fix it? https://api.travis-ci.com/v3/job/378975005/log.txt

@stephanie-wang stephanie-wang added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Aug 28, 2020
@stephanie-wang (Contributor):

@gabrieleoliaro, I'm not sure if that last commit will fix the issue. That annotation means that the compiler should check that the mutex is held whenever that method is called. But the method is public and the mutex is private, so I don't think it will work.

You can fix the error by acquiring the lock inside the method!
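
Concretely, that might look like the following (a sketch; the method name mirrors the empty-check discussed above but is an assumption):

// Take the private mutex inside the method instead of annotating the public
// method with a thread-safety requirement that external callers cannot satisfy.
bool CoreWorkerDirectTaskSubmitter::CheckNoSchedulingKeyEntries() {
  absl::MutexLock lock(&mu_);
  return scheduling_key_entries_.empty();
}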

@goliaro (Contributor Author) commented Aug 29, 2020

Hey @gabrieleoliaro, looks like there is a build error. Could you fix it? https://api.travis-ci.com/v3/job/378975005/log.txt

Just pushed!

@goliaro (Contributor Author) commented Aug 29, 2020

@gabrieleoliaro, I'm not sure if that last commit will fix the issue. That annotation means that the compiler should check that the mutex is held whenever that method is called. But the method is public and the mutex is private, so I don't think it will work.

Sorry, this was just a first commit. I was not done yet :)

@ffbin (Contributor) commented Aug 31, 2020

https://travis-ci.com/github/ray-project/ray/jobs/379015992 Hi @gabrieleoliaro, the Java CI job failure is related to this PR, please help take a look, thanks.

@stephanie-wang (Contributor):

https://travis-ci.com/github/ray-project/ray/jobs/379015992 Hi @gabrieleoliaro, the Java CI job failure is related to this PR, please help take a look, thanks.

Really? Seems unlikely, the test is failing on master too: https://travis-ci.com/github/ray-project/ray/jobs/379177286

@stephanie-wang stephanie-wang merged commit 05fe6dc into ray-project:master Aug 31, 2020
@ffbin (Contributor) commented Aug 31, 2020

https://travis-ci.com/github/ray-project/ray/jobs/379015992 Hi @gabrieleoliaro, the Java CI job failure is related to this PR, please help take a look, thanks.

Really? Seems unlikely, the test is failing on master too: https://travis-ci.com/github/ray-project/ray/jobs/379177286

Sorry, it should be unrelated.

@goliaro mentioned this pull request Sep 6, 2020