New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Hashset to GetAllReferencedProjects to only load project once #2409

Merged
merged 13 commits into from Jun 11, 2017

Conversation

Projects
None yet
3 participants
@gerardtoconnor
Contributor

gerardtoconnor commented Jun 10, 2017

This PR is to prevent the same project paths being loaded again if already loaded using a hashset of the project paths to check against before loading.

@forki

This comment has been minimized.

Show comment
Hide comment
@forki

forki Jun 10, 2017

Member

I think we also need to cache multiple calls to it

Member

forki commented Jun 10, 2017

I think we also need to cache multiple calls to it

Create ProjectFile.fs
need brackets on not function
@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 10, 2017

Contributor

I'm not sure how much you want to optimize this but, as well as caching, I believe we be able to run the fetch in parallel if we use Async XML node readers in the path loads ... I've only started looking at Paket today so not sure how much you want me to investigate performance tweaks?

Contributor

gerardtoconnor commented Jun 10, 2017

I'm not sure how much you want to optimize this but, as well as caching, I believe we be able to run the fetch in parallel if we use Async XML node readers in the path loads ... I've only started looking at Paket today so not sure how much you want me to investigate performance tweaks?

@forki

This comment has been minimized.

Show comment
Hide comment
@forki

forki Jun 10, 2017

Member

I think it's good for now. We will release it and ask for feedback. Thank you

Member

forki commented Jun 10, 2017

I think it's good for now. We will release it and ask for feedback. Thank you

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 10, 2017

Contributor

I added a DependencyCache field to ProjectFile, not ideal but given the dependencies can change between pack runs, and ProjectFile obj only created once, caching to obj is better than IO/Persisted etc that will load stale data on subsequent runs.

Contributor

gerardtoconnor commented Jun 10, 2017

I added a DependencyCache field to ProjectFile, not ideal but given the dependencies can change between pack runs, and ProjectFile obj only created once, caching to obj is better than IO/Persisted etc that will load stale data on subsequent runs.

@matthid

This comment has been minimized.

Show comment
Hide comment
@matthid

matthid Jun 10, 2017

Member

To be honest I'd prefer the first solution (sorry @gerardtoconnor). Reason is that I don't think that additional field yields a lot of benefit. Of course I'd be easily convinced by numbers :)

Member

matthid commented Jun 10, 2017

To be honest I'd prefer the first solution (sorry @gerardtoconnor). Reason is that I don't think that additional field yields a lot of benefit. Of course I'd be easily convinced by numbers :)

@matthid

This comment has been minimized.

Show comment
Hide comment
@matthid

matthid Jun 10, 2017

Member

Either way lets see what the CI says

Member

matthid commented Jun 10, 2017

Either way lets see what the CI says

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 10, 2017

Contributor

@matthid I ironically agree also, the only other option is to cache the result appropriately in each method where the dependencies are called, as high up as possible such that all child functions use the result, not the function, may require a change in some function signatures to include dependencies ... I'll look into it now.

Contributor

gerardtoconnor commented Jun 10, 2017

@matthid I ironically agree also, the only other option is to cache the result appropriately in each method where the dependencies are called, as high up as possible such that all child functions use the result, not the function, may require a change in some function signatures to include dependencies ... I'll look into it now.

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 10, 2017

Contributor

I have now updated with a cache path dictionary that is created on the Pack start and propagates through to all callers for GetAllReferences()

Contributor

gerardtoconnor commented Jun 10, 2017

I have now updated with a cache path dictionary that is created on the Pack start and propagates through to all callers for GetAllReferences()

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 10, 2017

Contributor

I have added caching to dependency reference also, to reduce the memory footprint, I used hash of reference paths rather than large lists and dictionaries of string. Would be great if we could test this on the big project that was having the issue before merged.

Contributor

gerardtoconnor commented Jun 10, 2017

I have added caching to dependency reference also, to reduce the memory footprint, I used hash of reference paths rather than large lists and dictionaries of string. Would be great if we could test this on the big project that was having the issue before merged.

@matthid

This comment has been minimized.

Show comment
Hide comment
@matthid

matthid Jun 10, 2017

Member

Travis says there are some problems. To be completly honest with you I'd rather go with the most simple implementation you started with and maybe use the new "tracing" capabilities of paket5 to see if performance is actually still a problem afterwards.

Note that I don't have problems with the breaking changes in particular (we already have a lot for paket5) but it might be a problem if methods get an additional parameter where it is not clear (on first sight) why they are needed.

Sorry for the back and forth and clarifying so late. It's only my personal opinion, @forki might have another one and I'm completely fine with merging this - once it's green ;)

Member

matthid commented Jun 10, 2017

Travis says there are some problems. To be completly honest with you I'd rather go with the most simple implementation you started with and maybe use the new "tracing" capabilities of paket5 to see if performance is actually still a problem afterwards.

Note that I don't have problems with the breaking changes in particular (we already have a lot for paket5) but it might be a problem if methods get an additional parameter where it is not clear (on first sight) why they are needed.

Sorry for the back and forth and clarifying so late. It's only my personal opinion, @forki might have another one and I'm completely fine with merging this - once it's green ;)

gerardtoconnor added some commits Jun 10, 2017

@forki

This comment has been minimized.

Show comment
Hide comment
@forki

forki Jun 11, 2017

Member

thanks!

Member

forki commented Jun 11, 2017

thanks!

@forki forki merged commit 8c8ca42 into fsprojects:master Jun 11, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 11, 2017

Contributor

@matthid I fully understand on changing function signatures, at the end of the day, if I wanted to deploy a centralised cache, I needed to ensure only one was created (start of pack method) and propagated down to all calls of GetAllReferences, I appreciate my solution is not pretty or Ideal but on the bright side, the ugliness has come at a significant performance boost, @konste tested and it brings pack time down from an 'Eternatiy' to 11 seconds. There is more performance tweaking I can do to improve further but as @forki said, we're probably good for now, 11 secs tolerable. If there is any commenting/ docs you want me to update to reflect this cache being passed around, let me know (it's a memoise more than a cache really).

Contributor

gerardtoconnor commented Jun 11, 2017

@matthid I fully understand on changing function signatures, at the end of the day, if I wanted to deploy a centralised cache, I needed to ensure only one was created (start of pack method) and propagated down to all calls of GetAllReferences, I appreciate my solution is not pretty or Ideal but on the bright side, the ugliness has come at a significant performance boost, @konste tested and it brings pack time down from an 'Eternatiy' to 11 seconds. There is more performance tweaking I can do to improve further but as @forki said, we're probably good for now, 11 secs tolerable. If there is any commenting/ docs you want me to update to reflect this cache being passed around, let me know (it's a memoise more than a cache really).

@matthid

This comment has been minimized.

Show comment
Hide comment
@matthid

matthid Jun 11, 2017

Member

If there is any commenting/ docs you want me to update to reflect this cache being passed around, let me know (it's a memoise more than a cache really).

@gerardtoconnor No everything is good, thanks for taking care of this :)

Member

matthid commented Jun 11, 2017

If there is any commenting/ docs you want me to update to reflect this cache being passed around, let me know (it's a memoise more than a cache really).

@gerardtoconnor No everything is good, thanks for taking care of this :)

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 11, 2017

Contributor

@forki @matthid apologies for my ignorance, but if I want to run the integration tests (on my local windows 10 machine) is there a way? there were just a few things I wanted to tidy and simplify in my caching algo and if I could run tests locally would save annoying Travis etc?

For big edge case projects that have hundred(s) references, is it worth creating an integration test for big projects that can ensure we are not introducing some recursion bugs?

Contributor

gerardtoconnor commented Jun 11, 2017

@forki @matthid apologies for my ignorance, but if I want to run the integration tests (on my local windows 10 machine) is there a way? there were just a few things I wanted to tidy and simplify in my caching algo and if I could run tests locally would save annoying Travis etc?

For big edge case projects that have hundred(s) references, is it worth creating an integration test for big projects that can ensure we are not introducing some recursion bugs?

@matthid

This comment has been minimized.

Show comment
Hide comment
@matthid

matthid Jun 11, 2017

Member

Usually I open the solution in VS build all and run tests from the test runner (including integration tests) or just build.cmd in cmd

note that you might need the nunit test runner for this to work in vs (build.cmd) should work always

Member

matthid commented Jun 11, 2017

Usually I open the solution in VS build all and run tests from the test runner (including integration tests) or just build.cmd in cmd

note that you might need the nunit test runner for this to work in vs (build.cmd) should work always

@forki

This comment has been minimized.

Show comment
Hide comment
@forki

forki Jun 11, 2017

Member

also just calling build.cmd RunIntegrationTests from cmd line should do it.

Member

forki commented Jun 11, 2017

also just calling build.cmd RunIntegrationTests from cmd line should do it.

@gerardtoconnor

This comment has been minimized.

Show comment
Hide comment
@gerardtoconnor

gerardtoconnor Jun 11, 2017

Contributor

Much appreciated gentleman, build cmd working ( I forgot .cmd treating it like exe ) Helpful to know on VS, I had been working in vscode

Contributor

gerardtoconnor commented Jun 11, 2017

Much appreciated gentleman, build cmd working ( I forgot .cmd treating it like exe ) Helpful to know on VS, I had been working in vscode

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment