Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable autodownload by default #2953

Merged
merged 34 commits into from Sep 22, 2021
Merged

Conversation

shrit
Copy link
Member

@shrit shrit commented May 24, 2021

Signed-off-by: Omar Shrit omar@shrit.me

Signed-off-by: Omar Shrit <omar@shrit.me>
@shrit shrit changed the title Disable autodownload be default Disable autodownload by default May 24, 2021
Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the idea here, but I think we'll need to make a few CI changes. ensmallen will need to be installed; I think you should be able to use apt-get, brew, and nuget in the various .ci/ files. There may be a bit of work to be done on the build nodes on Jenkins, but that should be pretty easy.

@shrit
Copy link
Member Author

shrit commented May 25, 2021

I will update the ci to download all necessary dependencies 👍

@rcurtin
Copy link
Member

rcurtin commented May 25, 2021

I installed libensmallen-dev and libstb-dev on all the CI build nodes I control, so hopefully the Jenkins jobs should now work correctly too. @zoq would you mind installing those packages on the CI nodes of yours? :)

@zoq
Copy link
Member

zoq commented May 26, 2021

Some or most of the jobs use https://github.com/mlpack/jenkins-conf/blob/master/Dockerfiles/jenkins-amd64-debian.Dockerfile so we only have to update the file and rebuild the docker. I'll update it in a second and push it.

@zoq
Copy link
Member

zoq commented May 27, 2021

Docker files updated, and images rebuild.

shrit added 4 commits May 27, 2021 16:41
Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
.ci/linux-steps.yaml Outdated Show resolved Hide resolved
Signed-off-by: Omar Shrit <omar@shrit.me>
@shrit
Copy link
Member Author

shrit commented Jun 19, 2021

I found it strange that the CI was not able to locate libensmallen-dev maybe I did not write the package name correctly? it seems to be a part of the Ubuntu repository.
Also, there is a lib libstb-dev available for Ubuntu and Debian. Let me know if it has the same headers or, I will just wget it from mlpack website.

@rcurtin
Copy link
Member

rcurtin commented Jun 19, 2021

It looks like the Azure build workers use xenial, which is pretty old and doesn't have libensmallen-dev available (you can see this in the output, but the build doesn't fail). It seems like the easiest solution would be to just install ensmallen from source too. 👍

Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just left two comments that should hopefully fix the build. 👍

.ci/linux-steps.yaml Outdated Show resolved Hide resolved
.ci/linux-steps.yaml Outdated Show resolved Hide resolved
shrit and others added 2 commits June 23, 2021 19:24
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
.ci/windows-steps.yaml Outdated Show resolved Hide resolved
Signed-off-by: Omar Shrit <omar@shrit.me>
.ci/macos-steps.yaml Outdated Show resolved Hide resolved
Signed-off-by: Omar Shrit <omar@shrit.me>
@shrit
Copy link
Member Author

shrit commented Sep 16, 2021

@rcurtin Good news, Armadillo library has been finally found on windows, let us see if there will be no linking issues, if yes then we can merge this pull request 👍

@shrit
Copy link
Member Author

shrit commented Sep 16, 2021

Here is the last error from this pull request,
@rcurtin, I think this error is coming from the armadillo, any possibilities of an ABI incompatibility?

##[error]mlpack.lib(adaboost_model.obj)(0,0): Error LNK2001: unresolved external symbol "class std::mersenne_twister_engine<unsigned __int64,64,312,156,31,-5403634167711393303,29,6148914691236517205,17,8202884508482404352,37,-2270628950310912,43,6364136223846793005> arma::mt19937_64_instance" (?mt19937_64_instance@arma@@3V?$mersenne_twister_engine@_K$0EA@$0BDI@$0JM@$0BP@$0?EKPNJAKFFGJJOGBH@$0BN@$0FFFFFFFFFFFFFFFF@$0BB@$0HBNGHPPPONKGAAAA@$0CF@$0?IBBCAAAAAAAAA@$0CL@$0FIFBPECNEMJFHPCN@@std@@A)

@rcurtin
Copy link
Member

rcurtin commented Sep 17, 2021

Wait a second... mt19937_64_instance is not a symbol that exists in Armadillo 9.800.6! Is it possible that Armadillo is installed twice on the system? I think this might be the case---is the ensmallen nuget package shipping with a bundled Armadillo? I think this could cause a problem if both ENSMALLEN_INCLUDE_DIR and ARMADILLO_INCLUDE_DIR have armadillo_bits/ subdirectories---the compiler might include the wrong one.

If this is correct (please double-check my reasoning above), then I see two options:

  1. Remove the Armadillo headers that shipped with ensmallen, so that we are always using Armadillo 9.800.6 in this build.
  2. Use the Armadillo headers that shipped with ensmallen, instead of downloading and compiling Armadillo locally.

I think the first solution might be preferable, since it always allows us to build mlpack against the minimum required version of Armadillo.

Comment on lines 18 to 20
tree $(Agent.ToolsDirectory)
## Delete all ensmallen dependencies, we do not need them here
Remove-Item $(Agent.ToolsDirectory)\ensmallen.2.17.0\installed\x64-linux\share -Force -Recurse
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcurtin I have already done that, I suppose that maybe it is not working? I will try to verify if the directory is still there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I missed that. Well, maybe it is worth printing everything in the ensmallen directory, just to make sure the headers are not there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I find strange is that it has never been able to link with any armadillo version including the one that is in ensmallen package. However, when I corrected the path to add the tree\include, it showed behaviour similar as if it was linking with the package.
I have added a tree to see if armadillo is not deleted from the ensmallen package, let us see.

Copy link
Member Author

@shrit shrit Sep 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rcurtin we are sure now that visual studio is trying to use the local headers in the ensmallen package, which is causing all of these issues.
Have a look here
https://dev.azure.com/mlpack/mlpack/_build/results?buildId=7292&view=logs&j=934eac7e-378e-5e05-ed87-f05e34fa9fdc&t=4ee4870b-c6ca-5a83-cdbe-639dd57a3582&l=9973

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a look at the error:

##[error]C:\hostedtoolcache\windows\ensmallen.2.17.0\installed\x64-linux\include\armadillo(21,10): Error C1083: Cannot open include file: 'armadillo_bits/compiler_check.hpp': No such file or directory

I think that you have removed everything of Armadillo that's distributed with ensmallen... except the armadillo file. So if you remove that too, in addition to the armadillo_bits/ directory, then I think we will be closer to something that works. :)

Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
Signed-off-by: Omar Shrit <omar@shrit.me>
@shrit
Copy link
Member Author

shrit commented Sep 20, 2021

@rcurtin I think it has passed now, finally, I only got out of heap space error, which is normal from time to time.
I restarted the windows build to see if it may pass this time.

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good to me here. Thank you so much for putting so much effort into this! Before merge, I think we should still make sure that the Windows build succeeds at least once (we might have to restart it a few times).

.ci/windows-steps.yaml Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Member

@zoq zoq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea to clean the commit history or we just do a squash & merge and clean it there.

nuget install unofficial-flayan-cereal -o $(Agent.ToolsDirectory)

nuget install ensmallen -o $(Agent.ToolsDirectory) -Version 2.17.0
## Delete all ensmallen dependencies including armadillo headers, we do not need them here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I get the reason why we download the ensmallen package from nuget, and delete everything besides the header files. In the linux build we just download the header files and copy everything into the target directory. Is there anything that I'm missing here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, I do not remember why we went into NuGet direction, there was a strong reason, but I can not remember why

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After all of this suffering, I think we can merge this pull request. Whatever the first reason was, it is better to use NuGet for future versions, we will have to change only the number now for ensmallen

shrit and others added 3 commits September 22, 2021 00:07
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Co-authored-by: Ryan Curtin <ryan@ratml.org>
Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

@rcurtin
Copy link
Member

rcurtin commented Sep 22, 2021

Still a memory error; I tried restarting the build again.

@shrit
Copy link
Member Author

shrit commented Sep 22, 2021

@rcurtin was it a linking error?

@rcurtin
Copy link
Member

rcurtin commented Sep 22, 2021

Nope, just 'compiler out of heap space'. I'd just like to make sure we can at least get it to build cleanly once. :)

@shrit
Copy link
Member Author

shrit commented Sep 22, 2021

In this case, the issue will be coming from armadillo 9.8 since it is the first time we try to compile mlpack with it on Windows.
I do not know how we can avoid these, maybe cut the tests examples into several units?

@shrit
Copy link
Member Author

shrit commented Sep 22, 2021

All green 💯 🚀

@jeffin143 jeffin143 merged commit 7ffc808 into mlpack:master Sep 22, 2021
@rcurtin
Copy link
Member

rcurtin commented Sep 22, 2021

Awesome! Great work @shrit, super cool we were finally able to get this one working right. :)

@shrit shrit deleted the disable_autodownloader branch September 22, 2021 17:47
@shrit
Copy link
Member Author

shrit commented Sep 22, 2021

@rcurtin @jeffin143 thanks, I think we are close to merging #3006, if you have any spare time, do not hesitate in giving hints on the Unicode strange loop behaviors

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants