Record runs #232

Merged
merged 1 commit into from
Jun 8, 2017


@saulshanabrook
Collaborator

saulshanabrook commented Mar 17, 2017

All this needs to happen before I encourage anyone to start trying this out:

  • get new schemas working
    • check that it can take the Clojush schema
    • change how values are added to support them, maybe using spec and multimethods (forget about speed)
    • switch Clojush code to support them
    • speed up!
  • get working with HDFS
    • make sure hdfs deploys ok
    • change tests to use it
    • try on real run
    • instead of writing to HDFS, write to local filesystem and copy (profile this)
  • create example queries:
    • overview of runs: uuid, problem-file, name, generation, status
    • overview of one run: given uuid -> index, best-error best-generalization-error
    • replicate amps results
  • write documentation about the rationale and how to use it, on Discourse or here
  • try with parquet version 2 file
  • remove proto-repl-sayid deps
  • add command line field to make this optional
  • try v2 on spark and see if that works
  • deploy default spark w/ jupyter instance on docker
    • move reusable stuff to another notebook
  • add time (waiting on [SPARK-10364][SQL] Support Parquet logical type TIMESTAMP_MILLIS apache/spark#15332)
  • change to 0mq
  • move clojure code back to ici-recorder
  • extract spark generation to a separate file
  • add a test that sends JSON data and processes it in spark

@lspector
Owner

lspector commented Mar 18, 2017

Just to clarify before I merge:

  • None of this will have any effect (including re: outputs produced) if run is called with default args?

  • I don't need to worry about that "All checks have failed" business?

@saulshanabrook
Collaborator Author

Don't merge yet! I just put this up to be public about my progress, but I want to check off all those things in my todo first before merging.

@lspector
Owner

Ah, thanks. Maybe for future PRs like this make that status a little more obvious for me?

@saulshanabrook
Collaborator Author

Sure, the WIP in the title means "work in progress" aka not ready yet

@lspector
Owner

Ah! Great & thanks for cluing me in.

@saulshanabrook
Collaborator Author

saulshanabrook commented May 22, 2017

I tried to be as minimally invasive to the code base as possible. There is still some progress to make on this (as documented on discourse), but I think this is in a good enough shape to merge. I have been using this branch and it has been working out.

@saulshanabrook saulshanabrook changed the title WIP: Record runs with Parquet Record runs May 22, 2017
@saulshanabrook saulshanabrook force-pushed the add-record branch 2 times, most recently from eebe625 to eb2296f Compare June 8, 2017 20:07