Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source code decoupling #5

Open
daskol opened this issue Sep 1, 2017 · 3 comments
Open

Source code decoupling #5

daskol opened this issue Sep 1, 2017 · 3 comments

Comments

@daskol
Copy link
Contributor

daskol commented Sep 1, 2017

It would be better to restruct code and publish a kind of interface in order to everyone can use it as dependency. There are several issues.

  1. Separate entry point in order to use deepwalk in different contexts.
  2. Provide option to compile without OpenMP. Decoration of omp_get_thread_num() call with _OPENMP macro wil be nice.
  3. Add some tests. For example, death test can help to detect wrong memory writes.
  4. There is shared inner state that prevent usage as is in concurrent runtime.

The first two points are the most crucial for me since I would like to use it in side project.

@xgfs
Copy link
Owner

xgfs commented Sep 1, 2017

In the arbitrary order:

  1. Doing that will ruin portability for Windows, as there is not -openmp-simd flag that I would have used otherwise on gcc, clang and icc. Besides, what is wrong with openmp dependency? It is supported by literally any compiler nowadays.
  2. I am not sure if this is entirely possible to decouple rng state without serious performance degradation. The code is optimized for speed now, it is not really worth to lose it.
  3. After the fix, there seems no memory errors on the graphs I have. It is pretty unreasonable to test other functions than hsm.
  4. It is related to [4]. Until [4] is done, I see little reason for that.

I just do not see the benefits of [3], and [4,1] seem to have little impact on other users.

@daskol
Copy link
Contributor Author

daskol commented Sep 1, 2017

Well I see. It is up to you.

  1. The reason that it is. OpenMP sometimes makes debugging much harder and inject external dependencies.
  2. Nevertheless it is possible even though you have not tried or won't try.
  3. Memory errors are only one side the other is correctness. How do you prove that implementation unerring? By the way, what testing interface should be in master mean here Wip/memory corruption #4?
  4. I misunderstand this point totally.

What kind of users are you talking about? I am user. It have large impact on me.

@xgfs
Copy link
Owner

xgfs commented Sep 1, 2017

  1. It seems like OpenMP is supported by most sensible and few non-sensible compilers. I would not really agree that OpenMP is too much of an external dependency as so many compilers support it natively. Going for pthreads would break Windows support, going std::parallel would break any compiler not happy with c++17. I did not find the proper way to fully disable openmp without losing FMA speedup on the update loop: maybe there is. If you find one, I will be happy to merge.

  2. I did try to abstract the RNG, the performance dropped significantly. There is a reason things are, even in the RNG choice. Yes, using something like taus88 would be more "clean", yet the program loses in speed.

  3. I have verified the performance on the classification tasks on the datasets from the original paper (download link). Wip/memory corruption #4 is not correct.

[5] Was referring to [1]. The intended usage of the tool is to produce an embedding of a graph. I di not really see how coupled code hinders the functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants