Ask Non-Local Grid From Other MPI Rank #26

Closed
cindytsai opened this issue Sep 3, 2021 · 7 comments · Fixed by #31
Labels: new-feature (New feature), pri-low (Priority: low), testrun (Test run)

Comments

@cindytsai
Collaborator

cindytsai commented Sep 3, 2021

Ask Non-Local Grid From Other MPI Rank

  • Possible Solution:
    • libyt gets the non-local grids from other ranks and passes them back to yt, the same way derived fields are handled.
    • Reference
    • Code template
@cindytsai cindytsai self-assigned this Sep 3, 2021
@cindytsai cindytsai added enhancement New feature or request pri-low Priority: low labels Sep 3, 2021
@cindytsai cindytsai added this to To do in libyt-v0.1 via automation Sep 3, 2021
@cindytsai cindytsai changed the title Ask Non-Local From Other MPI Rank Ask Non-Local Grid From Other MPI Rank Sep 4, 2021
@cindytsai cindytsai added new-feature New feature. and removed enhancement New feature or request labels Sep 8, 2021
@cindytsai cindytsai moved this from To do to In progress in libyt-v0.1 Sep 8, 2021
libyt-v0.1 automation moved this from In progress to Done Sep 14, 2021
@cindytsai cindytsai reopened this Sep 14, 2021
libyt-v0.1 automation moved this from Done to In progress Sep 14, 2021
@cindytsai
Collaborator Author

cindytsai commented Sep 15, 2021

Solution

Window creation in MPI RMA is a collective operation, so every rank must take part in it at the same point. This means we have to enter the RMA stage at the very beginning of the IO. (I did not consider load balancing. If we need it later, we only have to make sure every rank calls the same function, so that they all enter this stage together.)
To do this,

Distinguish local/non-local grids.

See cindytsai/yt branch libyt-NonLocal.

In the libyt yt frontend, gather all the non-local grids from each rank's perspective.

See cindytsai/yt branch libyt-NonLocal.
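
A minimal sketch of the first two steps, assuming each rank already knows which rank owns every grid in the full AMR hierarchy. The list `grid_owner` and the function name are hypothetical and only for illustration; the actual bookkeeping lives in the libyt-NonLocal branch and may look different. Each rank splits the grids it needs into local ones and non-local ones grouped by owning rank:

```python
from collections import defaultdict


def split_local_remote(needed_grids, grid_owner, my_rank):
    """Split grid ids into local ones and non-local ones grouped by owner.

    needed_grids: iterable of grid ids this rank wants to read.
    grid_owner:   sequence where grid_owner[gid] is the MPI rank holding gid
                  (hypothetical layout, for illustration only).
    my_rank:      rank of the calling process.
    """
    local = []
    remote_by_rank = defaultdict(list)
    # Deduplicate and sort so the result does not depend on input order.
    for gid in sorted(set(needed_grids)):
        owner = grid_owner[gid]
        if owner == my_rank:
            local.append(gid)
        else:
            remote_by_rank[owner].append(gid)
    return local, dict(remote_by_rank)


if __name__ == "__main__":
    # Six grids spread over three ranks; rank 1 wants grids 0, 2, 3 and 5.
    owner = [0, 0, 1, 1, 2, 2]
    local, remote = split_local_remote([5, 0, 3, 2], owner, my_rank=1)
    print(local)   # grids rank 1 can read directly
    print(remote)  # grids rank 1 must request, keyed by owning rank
```

Grouping by owner is one possible layout; it makes it easy to see which ranks have to expose data, but a flat list of (grid id, rank) pairs would work as well.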

Perform the RMA operation in a C-extension Python method.

  • Implement all the methods in the class.
  • Some data members may be redundant; remove them.
  • Modularize the get-remote-grid method in the libyt frontend IO.
  • Tune RMA: set info and assert in RMA.
    • MPI_Win_fence
    • MPI_Win_create_dynamic
  • Modularize the MPI_Gatherv and MPI_Bcast code that handles large send counts in yt_commit_grids.cpp.
    • Apply it to gather_all_prepare_data in the yt_rma class.
  • Rename the yt_rma class to yt_rma_field, and construct yt_rma_particle.
    • Test run on particle data.
      • What if the particle count is 0?
    • Test run on field data.
  • Log messages.
  • Sync the fields to get and the particles to get before calling the C extension(?)
    • This does not seem to be needed; they are symmetric.
    • Use sorted when getting fname_list and the ptf_c dictionary, so that each field to get is unique.
  • Do not call back into the C extension when there are no remote grids to get.
  • Keep only the grids that actually contain particles.
  • Set a log message in the libyt yt frontend when yt_rma_field or yt_rma_particle is used.
  • The extended C++ method should be able to handle a failure return from yt_rma if something goes wrong.
  • Support MPI_Get for large send and receive counts.
    • Not encountered yet, so not tested.
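
On the large-count items above: the classic MPI C bindings take element counts as `int`, so a single `MPI_Gatherv`, `MPI_Bcast` or `MPI_Get` call cannot move more elements than `INT_MAX` (2^31 - 1 on common platforms). One common workaround is to split the transfer into several calls. The sketch below only computes the (offset, count) pairs for such a split. The helper name and the choice of chunking (rather than, say, a derived datatype) are assumptions for illustration, not necessarily what yt_commit_grids.cpp does.

```python
INT_MAX = 2**31 - 1  # largest count a 32-bit C `int` argument can carry


def split_count(total, max_count=INT_MAX):
    """Return (offset, count) pairs covering `total` elements in order,
    with every count <= max_count. An empty transfer yields no chunks."""
    if total < 0:
        raise ValueError("total must be non-negative")
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    chunks = []
    offset = 0
    while offset < total:
        count = min(max_count, total - offset)
        chunks.append((offset, count))
        offset += count
    return chunks


if __name__ == "__main__":
    # Small numbers so the result is easy to read: 10 elements, at most 4 per call.
    print(split_count(10, max_count=4))
```

Each pair would then drive one communication call on the corresponding slice of the buffer. Note that a total of 0 produces no calls at all, which matches the "particle count is 0" case in the checklist.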

Collect the field and return.

Reference

Test Run

  • yt functionality (these functions did not support parallelism before we added support for getting non-local grids)
  • Field Data
    • cell-centered
    • face-centered
    • derived_func
  • Particle Data

TODO

  • Check the lifetime of data members in field_list and particle_list; they should exist until the very end.
    • User-supplied field_list and particle_list should exist until the end of the inline analysis.
  • Reference counting in Python: remember to release unused references.
  • Check that every allocated buffer is freed.
  • Will there be ranks that have no grids to read?
  • What if particle attributes use up all the memory? Not encountered yet.

@hyschive
Contributor

@cindytsai I like this approach. We can easily distinguish local and non-local grids when collecting the complete AMR structure.

@cindytsai
Collaborator Author

cindytsai commented Oct 9, 2021

Test Run

Volume Rendering

  • Test Problem: 3D Blast Wave.
  • Inline script:
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        sc = yt.create_scene(ds, lens_type="perspective")
        source = sc[0]
        source.tfh.set_log(True)
        source.tfh.grey_opacity = False
        source.tfh.plot("transfer_function.png", profile_field=("gas", "density"))
        sc.save("rendering.png", sigma_clip=4.0)
    • Ends with the same error as in post-processing.
      • Inline-Analysis
        Screenshot from 2021-10-09 12-46-46
      • Post-Processing
        Screenshot from 2021-10-09 12-44-25
    • yt-project issue: a workaround is to set the log level to info. Issue closed.
    • Run the script with debug messages turned off. (The yt issue has since been fixed.)
      • Post-Processing
        • MPI=1,2: Gives the correct result.
          rendering-NoSetBounds
      • Inline
        • MPI=2: (Ignore the zoom-in and the color transfer function.)
          rendering_2
        • MPI=3, 5 (odd numbers): Hangs and fails when saving the rendering image.
        • MPI=4: Success.
  • Inline script (Failed):
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        sc = yt.create_scene(ds, lens_type="perspective")
        source = sc[0]
        source.tfh.set_log(True)
        source.tfh.grey_opacity = False
        source.tfh.plot("transfer_function.png", profile_field=("gas", "density"))
        if yt.is_root():
            sc.save("rendering.png", sigma_clip=4.0)
    • Not all ranks call _read_fluid_selection, because saving the figure is wrapped in an if yt.is_root() clause.
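
The failure mode above can be illustrated without MPI. The sketch below is only an analogy: Python threads stand in for ranks, and a `threading.Barrier` stands in for a collective step such as RMA window creation. All names are made up for this example. When only the "root" thread reaches the barrier, it can never complete; real MPI would block there, while the simulation uses a timeout so that it returns.

```python
import threading


def run_ranks(n_ranks, root_only, timeout=0.5):
    """Simulate n_ranks processes with threads. A Barrier stands in for a
    collective call: it only completes when every rank reaches it."""
    barrier = threading.Barrier(n_ranks)
    results = [None] * n_ranks

    def rank_main(rank):
        if root_only and rank != 0:
            # Like ranks that never enter the `if yt.is_root():` branch.
            results[rank] = "skipped"
            return
        try:
            barrier.wait(timeout=timeout)  # the "collective" step
            results[rank] = "done"
        except threading.BrokenBarrierError:
            # Timed out waiting for the others; real MPI would hang instead.
            results[rank] = "stuck"

    threads = [threading.Thread(target=rank_main, args=(r,)) for r in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_ranks(3, root_only=False))  # every rank takes part
    print(run_ranks(3, root_only=True))   # only rank 0 takes part
```

In the inline scripts this corresponds to calling the operation that triggers the data read on every rank, and restricting only rank-independent work to the root.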

@cindytsai cindytsai mentioned this issue Oct 9, 2021
11 tasks
@cindytsai
Collaborator Author

cindytsai commented Oct 15, 2021

Test Run

OffAxisProjectionPlot

  • Test Problem: 3D Blast Wave.
  • Inline script: (Again, the log level should be set to info.)
    import yt
    yt.enable_parallelism()
    yt.set_log_level("info")
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        L = [1, 1, 0]
        north_vector = [-1, 1, 0]
        prj = yt.OffAxisProjectionPlot(ds, L, ("gas", "density"), north_vector=north_vector)
        if yt.is_root():
            prj.save()

Fig000000001_OffAxisProjection_density

@cindytsai
Collaborator Author

cindytsai commented Oct 15, 2021

Test Run

OffAxisSlicePlot

  • Test Problem: 3D Blast Wave
  • Inline script:
    import yt
    yt.enable_parallelism()  
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        L = [1, 1, 0]
        north_vector = [-1, 1, 0]
        cut = yt.SlicePlot(ds, L, ("gas", "density"), north_vector=north_vector, center=[0.5, 0.5, 0.5])
        if yt.is_root():
            cut.save()

Fig000000002_OffAxisSlice_density

@cindytsai
Collaborator Author

cindytsai commented Oct 18, 2021

Test Run

ParticlePlot

  • Test Problem: gamer Plummer
  • Inline Script: Both of the following scripts failed.
    # Script 1: save on the root rank only
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        par = yt.ParticlePlot( ds, 'particle_position_x', 'particle_position_y', ("io", "particle_mass"), center='c' )
        if yt.is_root():
            par.save()

    # Script 2: save on every rank
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        par = yt.ParticlePlot( ds, 'particle_position_x', 'particle_position_y', ("io", "particle_mass"), center='c' )
        par.save()

Output

Fig000000000_Particle_z_particle_mass

Expected Output

Fig000000000_Particle_z_particle_mass

@cindytsai
Collaborator Author

cindytsai commented Oct 18, 2021

Test Run

ParticleProjectionPlot

  • Test Problem: gamer Plummer
  • Inline Script: Both of the following scripts failed.
    # Script 1: save on the root rank only
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        par_prj = yt.ParticleProjectionPlot( ds, "z" )
        if yt.is_root():
            par_prj.save()

    # Script 2: save on every rank
    import yt
    yt.enable_parallelism()
    def yt_inline():
        ds = yt.frontends.libyt.libytDataset()
        par_prj = yt.ParticleProjectionPlot( ds, "z" )
        par_prj.save()

MPI = 2

Fig000000000_Particle_z_particle_ones

MPI = 1 (Expected)

Fig000000000_Particle_z_particle_ones

@cindytsai cindytsai mentioned this issue Oct 18, 2021
8 tasks
@cindytsai cindytsai linked a pull request Oct 20, 2021 that will close this issue
libyt-v0.1 automation moved this from In progress to Done Oct 20, 2021
@cindytsai cindytsai added the testrun Test run. label Nov 8, 2021