Move to deprecated branches dir
phlapjack committed May 3, 2008
1 parent 24426a5 commit 037b14c
Showing 75 changed files with 10,890 additions and 0 deletions.
118 changes: 118 additions & 0 deletions ts_manager/History.txt
@@ -0,0 +1,118 @@
== 0.9.3 2008-04-10
- Support starting Skynet with ./script/skynet start and stop to daemonize
- Close file handles on exec.
Skynet::Worker and Skynet::Manager now call Skynet.fork_and_exec instead of their own versions.
Skynet.fork_and_exec prevents file descriptor exhaustion by calling Skynet.close_file_handles.
Skynet::Manager detaches from console by calling Skynet.close_console
- Added printlog logging method which always prints to the log as [LOG]
- Deprecated Skynet.new to Skynet.start
- Mysql Message Queue Adapter - Make delete_expired_messages much safer.
- ActiveRecord::Base.distributed_find - Patch submitted by Lourens Naude (lourens@methodmissing.com) which checks the model for the primary_key name as opposed to assuming it is 'id'
- We don't want to use rails constantize so I've temporarily borrowed the method from ActiveSupport inflector and added it to skynet_ruby_extensions.
- Fix bug in Job comment where it referenced MapreduceTest instead of Skynet::MapreduceTest
- Fix tests. For some reason you still can't run ALL the tests at once with rake test, but if the files are run individually they all pass.

== 0.9.2 2008-01-22
Highlights:
- Multiple Message Queues
- Many more Job options including options to control how jobs are distributed.
- The various options for how a job is run have been made much clearer.
- The Mysql Message Queue Adapter has been optimized and made more reliable.
- You can now control how many times skynet retries failed master, map and reduce tasks.
- Large data sets can now be streamed to the queue.

Details:

ActiveRecord::Base.distributed_find
- Active Record distributed_find now handles REALLY large sets by breaking them into separate jobs. 1MM models per master broken into ranges of 1000.
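
The range splitting described above can be pictured with a minimal sketch. This is not Skynet's actual implementation: `id_ranges` and its parameters are hypothetical names, and the sizes only mirror the 1MM / 1000 figures from the entry.

```ruby
# Hypothetical helper (not part of Skynet): split an inclusive id span
# into consecutive ranges of at most `batch_size` ids each.
def id_ranges(first_id, last_id, batch_size)
  raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

  first_id.step(last_id, batch_size).map do |start|
    # The final range is clipped so it never runs past last_id.
    start..[start + batch_size - 1, last_id].min
  end
end

# One master's share of 1,000,000 ids, cut into ranges of 1,000 ids.
ranges = id_ranges(1, 1_000_000, 1_000)
puts ranges.size
puts ranges.first
puts ranges.last
```

Each range could then be handed to a map task as its input, so no single task has to load the whole set.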

Skynet::Job
- The code path through Skynet::Job is now clear. There are 3 ways to run a Skynet::Job: Local Master (default), Remote Master, and Async (implies remote master).
- Skynet::Job supports keep_map_tasks and keep_reduce_tasks settings.
If true, the master will run the tasks locally.
If a number is provided, the master will run the tasks locally if there are LESS THAN OR EQUAL TO the number provided
There are also Skynet::CONFIG settings for defaults. DEFAULT_KEEP_REDUCE_TASKS, DEFAULT_KEEP_MAP_TASKS
I can see there being a problem with the timeouts being a little off, since you're kind of in a master and kind of in a map or reduce timeout. The task timeouts will be correct at least, though you won't get the benefit of redoes yet.
- Skynet::Job now supports setting RETRY times per job by MASTER, MAP and REDUCE. So you can have a MASTER_RETRY=0, but have MAP_RETRY=2 and REDUCE_RETRY=3. There are now defaults for those as well :DEFAULT_MASTER_RETRY, :DEFAULT_MAP_RETRY, :DEFAULT_REDUCE_RETRY. If a message passes its RETRY it will be marked with an iteration of -1. delete_expired_messages removes those messages as well. These show up in the stats as :failed_tasks. Skynet::Task and Skynet::Message now have retry fields denoting the maximum number of retries.
- You can now pass queue_id or queue to Skynet::Job
- There is now a :MAX_RETRIES config setting that controls how many iterations Skynet will even look for tasks as well.
- You can now stream map_data to the queue by passing an Enumerable for your map_data
- Refactored Skynet::Job to be much cleaner and easier to test.
- Skynet::Job now has access to a local queue which it can treat almost like the real one.
- deprecate Skynet::Job#run_master
- rename reduce_partitioner method to just reduce_partition
- Skynet::Job has better support for running tasks locally. A job may run tasks in its own process if
you are running in solo mode, you've made a "single" job, or you set the keep_map_tasks or keep_reduce_tasks below.
When a job runs tasks locally it now honors the retry settings and timeouts.
- Skynet::Job and Skynet::AsyncJob are now almost identical. In fact you can just use Skynet::Job and tell it to run async.
- Skynet::Job Changed map_tasks and reduce_tasks to mappers and reducers respectively. This was to remove the ambiguity between the actual map/reduce tasks and the number of mappers/reducers desired.
- Skynet::Jobs can now be told what queue to use for that job by passing :queue or :queue_id. DEFAULT 0
- Skynet::Job won't call the reduce_partitioner if there are no valid results from the map_step.
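
The keep_map_tasks / keep_reduce_tasks rule from the list above can be restated as a small predicate. This is only a sketch of the rule as the entry describes it; `run_tasks_locally?` is a hypothetical name, not a Skynet method.

```ruby
# Hypothetical helper (not Skynet's code): decide whether the master should
# run a batch of tasks in its own process, following the rule in the entry:
#   true      -> always run locally
#   a number  -> run locally when task_count <= that number
#   otherwise -> hand the tasks to the queue
def run_tasks_locally?(keep_setting, task_count)
  case keep_setting
  when true    then true
  when Numeric then task_count <= keep_setting
  else false
  end
end

puts run_tasks_locally?(true, 500) # always local
puts run_tasks_locally?(10, 10)    # local: 10 tasks is within the limit of 10
puts run_tasks_locally?(10, 11)    # queued: over the limit
puts run_tasks_locally?(nil, 1)    # queued: no keep setting given
```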

MapreduceHelper mixin
- You can include MapreduceHelper into your class and then implement self.map_each and self.reduce_each methods. The included self.map and self.reduce methods will handle iterating over the map_data and reduce_data, passing each element to your map_each and reduce_each methods respectively. They will also handle error handling within that loop to make sure even if a single map or reduce fails, processing will continue. If you do not want processing to continue if a map fails, do not use the MapreduceHelper mixin.
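
A rough sketch of how such a mixin could work is shown here. `MapreduceHelperSketch` is a hypothetical stand-in written for illustration; the real MapreduceHelper may differ in names, return values, and how it reports errors.

```ruby
# Hypothetical stand-in for the mixin described above (not Skynet's code).
module MapreduceHelperSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Pass every element of map_data to the including class's map_each.
    def map(map_data)
      collect_each(map_data) { |datum| map_each(datum) }
    end

    # Pass every element of reduce_data to the including class's reduce_each.
    def reduce(reduce_data)
      collect_each(reduce_data) { |datum| reduce_each(datum) }
    end

    private

    # Collect results; an element that raises is reported and skipped so
    # the remaining elements are still processed.
    def collect_each(data)
      data.each_with_object([]) do |datum, results|
        begin
          results << yield(datum)
        rescue StandardError => e
          warn "skipping #{datum.inspect}: #{e.class}: #{e.message}"
        end
      end
    end
  end
end

class WordLengths
  include MapreduceHelperSketch

  def self.map_each(word)
    raise ArgumentError, "not a string" unless word.is_a?(String)
    [word, word.length]
  end

  def self.reduce_each(pair)
    pair.last
  end
end

mapped = WordLengths.map(["sky", 42, "net"]) # 42 fails and is skipped
p mapped
p WordLengths.reduce(mapped)
```

The trade-off is the one the entry names: a bad element no longer aborts the whole task, so a class that needs fail-fast behavior should implement map and reduce itself.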

Multiple Message Queues!
- Add the ability to have multiple message queues in the same table message_queue_table.
- You can start skynet with a --queue_id or --queue option to determine which queue workers should look in.
- Skynet::Jobs can now be told what queue to use for that job by passing :queue or :queue_id. DEFAULT 0
- Queues can be configured via Skynet::CONFIG[:MESSAGE_QUEUES] = [] which comes with an array of queues id 1 through 10 named "one" through "ten"
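
As an illustration of picking a queue by name or by id, here is a small sketch. The table layout, the `resolve_queue_id` helper, and the choice to let :queue_id win over :queue are all assumptions made for this example; the actual shape of Skynet::CONFIG[:MESSAGE_QUEUES] is not shown in this changelog.

```ruby
# Hypothetical queue table shaped after the description above: ids 1..10
# named "one".."ten". The real configuration layout may differ.
QUEUE_NAMES = %w[one two three four five six seven eight nine ten].freeze
QUEUE_IDS = QUEUE_NAMES.each_with_index.map { |name, i| [name, i + 1] }.to_h.freeze
DEFAULT_QUEUE_ID = 0 # matches the "DEFAULT 0" noted in the entry

# Hypothetical helper: turn a :queue name or a :queue_id into a numeric id.
# Assumption: an explicit queue_id takes precedence over a queue name.
def resolve_queue_id(queue: nil, queue_id: nil)
  return queue_id if queue_id
  return DEFAULT_QUEUE_ID if queue.nil?

  QUEUE_IDS.fetch(queue.to_s) { raise ArgumentError, "unknown queue #{queue.inspect}" }
end

puts resolve_queue_id                  # default queue
puts resolve_queue_id(queue: "three")  # looked up by name
puts resolve_queue_id(queue_id: 7)     # passed through as-is
```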

Skynet Console
- You now start the skynet console by running 'skynet console' at the command line. There is no longer a skynet_console app.
- The console now loads the configs that are in your local skynet script.

Skynet::Config
- Added Skynet.silent {} Runs your code with no debugging output.
- Made sure all config options for TupleSpace adapter begin with TS and all Mysql adapter CONFIG settings start with MYSQL.
- There is now a :MAX_RETRIES config setting that controls how many iterations Skynet will even look for tasks as well.

Skynet::Partitioners
- Created a Skynet::Partitioners class where various partitioners can be found. To specify one of them merely provide that specific Skynet::Partitioners subclass as your reduce_partitioner in your Skynet::Job

Skynet Install
- skynet_install now has a --mysql option. This installs the migration as well as a skynet_schema.sql file.

Mysql Message Queue Adapter
- You can now configure your database options outside of rails with Skynet::CONFIG[:MYSQL_*] options.
- Mysql adapter now updates the updated_on time of rows in skynet_message_queues
- Made take_next_task in the Mysql Message Queue Adapter safer and more efficient. There seems to be far less risk of a race condition where two workers would take the same task.
- Eliminated 1 db update per every task taken making it MUCH more efficient.
- Skynet::MessageQueueAdapter::Mysql now tries to reconnect if it gets disconnected. This was to solve the "Mysql Server has Gone Away" errors.
- Implemented version_active? in mysql message queue adapter. It's a way for workers to check to see if a version is still in the queue.

Skynet::Task
- Skynet::Task#master_task takes care of creating the master job and task now. I have mixed feelings about this.
- ENFORCED TIMEOUTS - Even though a master might give up on a worker if it didn't respond in time, there was nothing to stop any given worker from running forever. We now enforce the timeouts (master_timeout, map_timeout, reduce_timeout) given in Skynet::Job using the Timeout module. This causes a Timeout::Error to be raised. If you are using the mysql adapter, this can cause strange results sometimes. If the Timeout error is raised during a DB query, ActiveRecord will raise an ActiveRecord::StatementInvalid exception which includes the Timeout::Error exception in it. Not sure how to prevent that from happening.
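
The enforcement described above relies on Ruby's standard Timeout module. The following is a minimal sketch of that mechanism, not Skynet's code; `run_with_timeout` and the `:timed_out` return value are hypothetical.

```ruby
require "timeout"

# Hypothetical wrapper (not Skynet's code): run a task body under a hard
# time limit and report whether it finished.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error => e
  # A real worker would mark the task as failed or schedule a retry here.
  warn "task gave up after #{seconds}s (#{e.class})"
  :timed_out
end

p run_with_timeout(1) { :done }
p run_with_timeout(0.1) { sleep 1; :never_reached }
```

Note that this rescue only catches a bare Timeout::Error. In the database case mentioned in the entry, the error arrives wrapped in ActiveRecord::StatementInvalid, so a handler like this one would not see it.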

Skynet::Worker
- Workers now have a Skynet::CONFIG[:WORKER_MAX_PROCESSED] setting to control when to respawn based on how many tasks that worker has processed.
- You can start skynet with a --queue_id or --queue option to determine which queue workers should look in.
- Workers do not restart until there are no more items in the queue of that version.
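
The processed-count limit can be sketched as a simple loop. This is an illustration under assumptions, not the Skynet::Worker implementation: `work_until_limit`, the array standing in for the queue, and the `:respawn` / `:idle` results are all hypothetical.

```ruby
# Hypothetical worker loop (not Skynet's code): run tasks until the
# per-worker limit is reached or the queue is empty, then report why
# the loop stopped and how many tasks were processed.
def work_until_limit(queue, max_processed)
  processed = 0
  loop do
    return [:respawn, processed] if processed >= max_processed
    return [:idle, processed] if queue.empty?

    queue.shift.call # take the next task and run it
    processed += 1
  end
end

tasks = Array.new(5) { -> { :work } }
p work_until_limit(tasks, 3) # stops at the limit, leaving two tasks queued
p tasks.size
```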

Skynet::Message
- Skynet::Message now stores fields as an array. It is far more efficient now as well.

Now 90% more Tests!

BUGFIXES
- Skynet Workers now restart properly when the worker_version changes.
- When starting tuplespace_server, you no longer need to provide the --port if you're already providing the drburi.
- Fixed a bug where Skynet::MessageQueueAdapter::Mysql would sometimes pick up tasks another worker had already picked up.
- Fix bug in Skynet::Worker where it wouldn't die right if the max processed was reached.
- Workers were supposed to restart when the worker_version changed. They do that properly now.

Thanks to Jason Rimmer for finding these bugs.
- Fix bug in Skynet::Message where it would calculate the iteration improperly.
- The skynet gem appears to be missing the Rubigen dependency
- Running skynet_install even without the rails arg still generates code with rails dependencies and tailings: RAILS_ROOT, RAILS_ENV, and the various directory tailings such as 'db/migrate', etc.
- Generated skynet script is missing "require 'rubygems'"
- Specification of pid directory and file is incorrect as 'skynet_manager.rb' wants only a directory with it specifying the file
- The sleep while waiting to start the queue server isn't long enough. There is now a CONFIG setting TS_SERVER_START_DELAY.


== 0.0.1 2007-12-16

* 1 major enhancement:
* Initial release
20 changes: 20 additions & 0 deletions ts_manager/License.txt
@@ -0,0 +1,20 @@
Copyright (c) 2007 Adam Pisoni, Geni.com

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
