Question: Starting DataCleaner on Windows? #1926

Open
justinphebey opened this issue Jul 1, 2022 · 6 comments

justinphebey commented Jul 1, 2022

Hi, it seems that the releases have been cross-platform since version 5.2.1. I'm trying to use the latest (5.8.1), but how do you start the application on Windows?

If I double-click datacleaner.cmd, a command window opens briefly and then closes. I've tried increasing the memory allocation in datacleaner.cmd, but that doesn't help. If I use the Pentaho plugin (5.4.0), DataCleaner opens fine, with or without the memory changes in the datacleaner.cmd file.

I have openjdk8jre 8.252.9 installed to suit Pentaho Data Integration 9.2.0.0-290, and both are installed on Windows 10 Pro running in a Hyper-V VM.

Kind regards
Justin

Author

justinphebey commented Jul 2, 2022

I found out a bit more about what's causing the issue...

When I installed the Pentaho plugin (5.4.0), I discovered it wouldn't open DataCleaner. I found a rather old article that suggested, as troubleshooting advice, swapping out the 'commons-vfs2-x.x.jar' for a newer version.

https://holowczak.com/data-profiling-with-datacleaner-and-pentaho-data-integration/3/

Since I already have Pentaho installed, I replaced the jar that ships with DataCleaner with the one from Pentaho, which is 'commons-vfs2-2.7.0.jar'. This allowed the Pentaho plugin to open DataCleaner, given that I had also changed the Pentaho data-integration/launcher/launcher.properties file to append :../../../DataCleaner/lib to the list of libraries, as suggested in a ticket in the plugin's issue tracker (datacleaner/pdi-datacleaner#42 (comment)).

The following two attachments show the output of trying to open DataCleaner directly by double-clicking the datacleaner.cmd file. With the original vfs2 jar, DataCleaner opens, but there appears to be a logging issue, as each log entry is repeated. With the newer vfs2 jar, DataCleaner fails to open unless it is started via the Pentaho plugin.
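For anyone repeating the launcher.properties change, here is a minimal Python sketch of appending an entry to a colon-separated property value. The property key `libraries`, the function name `append_library`, and the sample file contents are assumptions made for illustration, so check them against your own launcher.properties before relying on this.

```python
def append_library(properties_text, extra, key="libraries"):
    """Append `extra` to the colon-separated value of `key`, at most once.

    `key` defaults to "libraries", which is an assumption about the file.
    Lines that do not define `key` are returned unchanged.
    """
    out = []
    for line in properties_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            entries = [e for e in value.strip().split(":") if e]
            if extra not in entries:
                entries.append(extra)
            line = f"{name}={':'.join(entries)}"
        out.append(line)
    return "\n".join(out)


# Hypothetical sample contents, not the real launcher.properties.
sample = "main=org.example.Main\nlibraries=../lib:../libswt"
print(append_library(sample, "../../../DataCleaner/lib"))
```

The function works on text rather than on the file itself, so the result can be reviewed before anything is written back. Note that it drops a trailing newline, if the input had one.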

Open DataCleaner with vfs2-2-1.txt
Open DataCleaner with vfs2-2-7-0.txt

Kind regards
Justin

@kaspersorensen
Member

Hi @justinphebey

Replacing the commons-vfs file in the lib directory is not the way to go. DC expects the older version to be there, and from your txt file I can see that it's no longer being loaded.

I would suggest opening a command prompt and running the .cmd file from there, so that the window stays open and you can see the output. I don't see any errors in the first txt/log file you provided.
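Before rerunning the .cmd file, it may also help to confirm which commons-vfs2 jar is actually sitting in the DC lib directory. Below is a minimal Python sketch of that check. The function name `vfs_versions` is hypothetical, the file name pattern is inferred from the jar names mentioned in this thread, and the lib directory path is whatever your installation uses.

```python
import re
from pathlib import Path

# Pattern inferred from names such as commons-vfs2-2.7.0.jar (an assumption).
VFS_JAR = re.compile(r"^commons-vfs2-(\d+(?:\.\d+)*)\.jar$")


def vfs_versions(lib_dir):
    """Return the version strings of commons-vfs2 jars directly in lib_dir."""
    found = []
    for path in Path(lib_dir).iterdir():
        match = VFS_JAR.match(path.name)
        if match:
            found.append(match.group(1))
    return sorted(found)
```

Calling `vfs_versions` with the path to the DataCleaner lib directory returns a list of version strings. More than one entry would mean that two copies of the jar are present side by side.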

@justinphebey
Author

Hi @kaspersorensen

Thanks for getting back to me.

The first text file was for the original commons-vfs file that ships with DC (5.8.1), and that does indeed allow DC to open via datacleaner.cmd.

When I raised the question, I had the newer version of the commons-vfs file in place to solve the problem of DC not opening via the Pentaho Data Integration plugin. So, to clarify, the actual issue is that the DC PDI plugin cannot open DC when DC has the commons-vfs2-2.1 file that it ships with in place.

Kind regards
Justin

@kaspersorensen
Member

Ah! I see.
I have to admit that it's been years since I worked on that plugin. You can find it in this repo: https://github.com/datacleaner/pdi-datacleaner

From what I recall, we tried to solve this issue (and other dependency mismatch issues) by having the plugin point at a DC installation directory instead of embedding DC into Pentaho itself. But it sounds like there may still be some limitations.

@justinphebey
Author

justinphebey commented Jul 6, 2022

@kaspersorensen

That's right. I made the change to the launcher.properties file in PDI, but that only works if I also replace the commons-vfs file in DC, which in turn stops DC from opening directly from datacleaner.cmd. It would work if DC could at some point be updated to ship with the newer commons-vfs file, though.

Kind regards
Justin

@kaspersorensen
Member

I see, yes.
Upgrading dependencies is always a nice improvement. Happily taking contributions for that sorta stuff :)
