[Proposal] Support more sinks/sources in scalding-spark #1988

Open
daniel-sudz opened this issue Apr 11, 2022 · 1 comment

Comments

@daniel-sudz
Contributor

Is your feature request related to a problem? Please describe.
Currently, scalding-spark supports only a few scalding sources/sinks. I am interested in adding support for more of the commonly used ones.

The current support is:

  /**
   * Maps some built-in scalding sinks to Spark sinks. Currently only WritableSequenceFile and TextLine
   * are supported.
   *
   * Users can add their own implementations and compose Resolvers using orElse.
   */
  val Default: Resolver[Output, SparkSink] =
    new Resolver[Output, SparkSink] {
      def apply[A](i: Output[A]): Option[SparkSink[A]] =
        i match {
          case ws @ WritableSequenceFile(path, fields, sinkMode) =>
            Some(writableSequenceFile(path, ws.keyType, ws.valueType).asInstanceOf[SparkSink[A]])
          case tl: TextLine =>
            Some(textLine(tl.localPaths.head).asInstanceOf[SparkSink[A]])
          case _ =>
            None
        }
    }
}
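
The doc comment above says that users can add their own implementations and compose Resolvers using orElse. A minimal, self-contained sketch of that composition pattern follows. Every type in it (`Output`, `SparkSink`, `Resolver`, `TextLineOut`, `JsonOut`) is a simplified, hypothetical stand-in defined only for illustration, not the real scalding or scalding-spark API, and the actual signature of `orElse` in scalding may differ.

```scala
object ResolverSketch {
  // Hypothetical stand-ins for illustration only; NOT the real scalding types.
  trait Output[A]
  final case class TextLineOut(path: String) extends Output[String]
  final case class JsonOut(path: String) extends Output[String]

  // Stand-in sink that only records how it would write.
  final case class SparkSink[A](describe: String)

  trait Resolver[From[_], To[_]] { self =>
    def apply[A](i: From[A]): Option[To[A]]

    // Try this resolver first; fall back to `that` when it returns None.
    def orElse(that: Resolver[From, To]): Resolver[From, To] =
      new Resolver[From, To] {
        def apply[A](i: From[A]): Option[To[A]] =
          self(i).orElse(that(i))
      }
  }

  // Plays the role of the built-in Default resolver: it only knows TextLineOut.
  val builtIn: Resolver[Output, SparkSink] =
    new Resolver[Output, SparkSink] {
      def apply[A](i: Output[A]): Option[SparkSink[A]] =
        i match {
          case TextLineOut(path) =>
            Some(SparkSink[String](s"text:$path").asInstanceOf[SparkSink[A]])
          case _ =>
            None
        }
    }

  // A user-supplied resolver adding support for one more (assumed) sink type.
  val jsonSupport: Resolver[Output, SparkSink] =
    new Resolver[Output, SparkSink] {
      def apply[A](i: Output[A]): Option[SparkSink[A]] =
        i match {
          case JsonOut(path) =>
            Some(SparkSink[String](s"json:$path").asInstanceOf[SparkSink[A]])
          case _ =>
            None
        }
    }

  // The built-in mappings win; the extension is consulted only as a fallback.
  val combined: Resolver[Output, SparkSink] = builtIn.orElse(jsonSupport)

  def main(args: Array[String]): Unit = {
    println(combined(TextLineOut("out.txt")).map(_.describe))
    println(combined(JsonOut("out.json")).map(_.describe))
  }
}
```

With this shape, supporting another sink would mean writing one more small resolver and chaining it on, without touching the built-in mappings.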
@johnynek
Collaborator

I think this is a good goal, but trying to make the current common inputs independent of cascading will be pretty hard...

Instead, I think you could probably make it work by keeping cascading on the classpath but excluding Hadoop, and simply never triggering Hadoop at runtime, since these kinds of matches only exercise equals and isInstanceOf.
