Skip to content

Start work on scalding-lingual support.#961

Open
richwhitjr wants to merge 4 commits intotwitter:developfrom
richwhitjr:develop
Open

Start work on scalding-lingual support.#961
richwhitjr wants to merge 4 commits intotwitter:developfrom
richwhitjr:develop

Conversation

@richwhitjr
Copy link
Contributor

Lingual has some nice features that would be nice to have for some small adhoc jobs and especially with the Scalding REPL. These changes do not contain the repl code yet(coming in a future pull request) but attempt make lingual a bit easier to use with Scalding libraries and allow for scalding code to still be written and run with a Lingual job.

This code also does not support nested functions yet in queries like thrift. Should be technically possible based on conversations with the Lingual team.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a bad whitespace change

@richwhitjr
Copy link
Contributor Author

Modified scald.rb also here. It look kind of broken if you gave it a jar. Now if you pass it a jar and it exists on disk it will use it. If it doesn't it will attempt to fall back to the twitter jar scheme.

Also added a --lingual command to run scalding with the lingual additions.

@richwhitjr
Copy link
Contributor Author

Also added a bit of code in scald.rb to allow a user to set the java heap memory for job compile. Run into issues here occasionally and has been happening frequently to coworkers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and openFile, for example) have no dependency to the rest of the code. Consider moving into the class object so it stays that way, and it's clear what are just short utilities, and what aren't?

Either that or make them private. I'm questioning the utility of making public something so simple, though I understand the internal utility.

@richwhitjr
Copy link
Contributor Author

@ianoc @jcoveney Any more feedback on this? If you guys don't think its worth merging I can rethink my approach.

@jcoveney
Copy link
Contributor

jcoveney commented Sep 4, 2014

Have you merged in a while? I have some bandwidth to take a look but I think you need to bring in most recent changes to scalding

Richard Whitcomb added 2 commits September 4, 2014 13:53
…rastructure, repl work coming later

Adding files that I missed in the last commit

Adding a scalding-lingual readme

Remove excludes, fix broken scald.rb and whitespace issue

Modify scald.rb so if it is given a real file for a jar use it, if it is given a jar it can't find them fallback to the twitter version

More cleanup around scalding-lingual and make it easier to add lingual classpath

Add a missing file and allow for people to set memory needed for scald.rb job compile

Some feedback based on jco review, fix the build because of caps in a filename

Remove not needed functions

Start of a work allowing for lingual jobs to make use of Scalding infrastructure, repl work coming later
@richwhitjr
Copy link
Contributor Author

Merged now. Did a squash also to cleanup a lot of the small commits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with implicits it's always good practice to make the return type explicit (mainly because it the class hierarchy is changed later on it could change the implicit functions in ways you do not expect)

@jcoveney
Copy link
Contributor

jcoveney commented Sep 4, 2014

Nice work. I do not love the "lingual DSL" approach (esp. given the more library-oriented approach a lot of scalding is going), but we can flag the project as "in progress" and let people mess around with it and see what shakes out.

What do you think @johnynek @ianoc ?

@richwhitjr
Copy link
Contributor Author

The DSL approach is a bit odd but it would nice to see it eventually better integrated into Scalding. Currently the Job abstraction and Source abstraction on top of Cascading provides great simplifications for a lingual job.

My thought here was to get something simple out and see how the community responds.

@johnynek
Copy link
Collaborator

johnynek commented Sep 6, 2014

Sorry for the slow response on my part. I'll give a thorough review this week.

@CLAassistant
Copy link

CLAassistant commented Jul 29, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Richard Whitcomb seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants