Running HyUCC algorithm on cli #17

Closed
faisal-ksolves opened this issue Nov 2, 2022 · 4 comments

Comments

@faisal-ksolves

Hello everyone,
I'm trying to run the HyUCC algorithm on the CLI with the following command:
java -cp metanome-cli-1.1.0.jar:HyUCC-1.2-SNAPSHOT.jar de.metanome.cli.App --algorithm de.metanome.algorithms.hyucc.HyUCC --file-key INPUT_GENERATOR --files load:/home/faisal/ksolves/metanome/cli/WDC_age.csv
But it throws an error:

Running de.metanome.algorithms.hyucc.HyUCC

  • in: [load:/home/faisal/ksolves/metanome/cli/WDC_age.csv]
  • out: file
  • configuration: []
Initializing algorithm.
Could not initialize algorithm.
de.metanome.algorithm_integration.AlgorithmConfigurationException: File not found!

What should I do now?
Can anyone help me?

@sekruse
Owner

sekruse commented Nov 3, 2022

Hi, the load: in the --files parameter looks incorrect to me. Assuming you want to profile WDC_age.csv, then it should just be

--files /home/faisal/ksolves/metanome/cli/WDC_age.csv

The description of the --files parameter isn't super clear, admittedly, but load: should only be used if you have a file that contains a list of files to be analyzed:

input file/tables to be analyzed and/or files list input files/tables (prefixed with 'load:')
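To make the distinction above concrete, here is a minimal shell sketch that picks the right form of the `--files` value: a plain path for a single dataset, or a `load:`-prefixed list file for several. The helper name `files_arg`, the `LIST_FILE` variable, and the default list file name `datasets.txt` are hypothetical, and the one-path-per-line layout of the list file is an assumption that this thread does not confirm.

```shell
#!/bin/sh
# Sketch only. Assumption: a "load:" list file holds one dataset path per line.
# files_arg, LIST_FILE and datasets.txt are hypothetical names for illustration.

# Print the value to pass to --files:
#   one argument   -> the path itself (no "load:" prefix)
#   several paths  -> written to a list file, printed as "load:<list file>"
files_arg() {
    if [ "$#" -eq 1 ]; then
        printf '%s\n' "$1"
    else
        list_file="${LIST_FILE:-datasets.txt}"
        # printf repeats the format for each argument: one path per line.
        printf '%s\n' "$@" > "$list_file"
        printf 'load:%s\n' "$list_file"
    fi
}

# Usage (not executed here):
#   java -cp metanome-cli-1.1.0.jar:HyUCC-1.2-SNAPSHOT.jar de.metanome.cli.App \
#     --algorithm de.metanome.algorithms.hyucc.HyUCC --file-key INPUT_GENERATOR \
#     --files "$(files_arg WDC_age.csv)"
```

With a single CSV file the helper returns the path unchanged, which is the form recommended for the command in this issue.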

@faisal-ksolves
Author

Thanks @sekruse, it has been resolved with the following command:
java -cp metanome-cli-1.1.0.jar:HyUCC-1.2-SNAPSHOT.jar de.metanome.cli.App --algorithm de.metanome.algorithms.hyucc.HyUCC --file-key INPUT_GENERATOR --files WDC_age.csv

@faisal-ksolves
Author

But here is another question: can we convert this project to Spark?

@sekruse
Owner

sekruse commented Nov 4, 2022

Glad to hear your problem is resolved!

For your second question, the answer is unfortunately: No. But that's also more a question for HyUCC, which is the algorithm doing all the heavy lifting. I am not aware of a meaningful way to implement UCC discovery on Spark that would beat single-machine performance.
