
Schema Integration! #232

sritchie opened this Issue Feb 14, 2014 · 3 comments

2 participants


I'd love to see someone take on integration of Prismatic's Schema library with Cascalog. The ability to write schemafied operations, and have the Cascalog compiler validate schemas before submitting jobs, could avoid the runtime errors that are one of the only downsides to Cascalog :)

@sritchie sritchie added the Idea label Feb 14, 2014

I love your idea. Could you elaborate on it, or share the use case you have in mind?

I thought that using Thrift, or any other serializer, already helps to enforce the schema.

Is this an alternative, or can the two play nicely together?

In the same spirit as core.typed and schema
"core.typed has accurate compile time checking, and Schema gives an expressive contracts interface for runtime checking." in



Yeah, for sure.

Thrift is nice for enforcing a schema when you write to disk, but that safety only kicks in once your job is running on the cluster. If you try to populate Thrift objects with items of the wrong type, you'll get runtime exceptions after job submission. This is painful, and a big waste of time.

If Cascalog query definitions could use Schema to check the input and output types of predicates, then Schema's "runtime" guarantee would prevent badly typed jobs from being submitted at all. The check would run on the client while the query is being built, so the "runtime" here is really a second compile time. This would play really nicely with Thrift.
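To make that concrete, here is a rough sketch using only Schema itself. It shows the kind of check that could run on the client before a job is submitted. Nothing here is an existing Cascalog API: the `Person` schema, the `birth-year` operation, and the `check-sample` helper are hypothetical names made up for illustration, and how the Cascalog compiler would actually hook in is left open.

```clojure
(require '[schema.core :as s])

;; Hypothetical shape of the data flowing into an operation.
(s/defschema Person
  {:name s/Str
   :age  s/Int})

;; A "schemafied" operation: the input and output types are declared
;; with Schema's s/defn annotations.
(s/defn birth-year :- s/Int
  [current-year :- s/Int
   person :- Person]
  (- current-year (:age person)))

;; Hypothetical pre-submission check. s/check returns nil when the value
;; matches the schema, and a description of the mismatch otherwise.
(defn check-sample
  [schema sample]
  (s/check schema sample))

;; Well-typed sample: nothing to report, so the job could go ahead.
(check-sample Person {:name "Ada" :age 36})

;; Badly typed sample: a non-nil error description, so refuse to submit.
(check-sample Person {:name "Ada" :age "36"})

;; With validation switched on, a badly typed call to birth-year would
;; throw here on the client instead of inside a running job.
(s/with-fn-validation
  (birth-year 2014 {:name "Ada" :age 36}))
```

The idea is that the failure shows up while the query is being defined, before anything reaches the cluster, and Thrift still enforces the on-disk schema as before.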


Thanks for the explanation.
Sounds really exciting. I hope to find some time to investigate it.
