Astropy Units with Dask distributed #11317

Open
AlecThomson opened this issue Feb 10, 2021 · 6 comments

@AlecThomson

Description

Hi there, I'm using dask to scale some work I'm doing. A small step includes some astropy unit conversions. This works fine when using distributed.LocalCluster for tests, but I'm getting some unexpected errors when I scale to use dask_jobqueue.SLURMCluster. I'm not sure if the issue lies inside of Dask or astropy units, but I wanted to see if there was something I was missing, or some edge-case that might be cropping up.

Expected behavior

I'm adding and converting between arcseconds and degrees. Both are angle units, so the conversion should succeed.

Actual behavior

I get the following error when deployed on the Slurm cluster:

Exception: UnitConversionError("'arcsec' (angle) and 'deg' (angle) are not convertible")

Steps to Reproduce

This is the closest to a basic version of the script I've been using. Frustratingly, though, this demo hasn't been reproducing the same error for me.

#!/usr/bin/env python
import numpy as np
from dask import delayed
from dask_jobqueue import SLURMCluster
from distributed import Client, progress
import astropy.units as u
import time

@delayed
def add(x):
    y = 1*u.deg
    out = x + y
    # Mimic some other work
    time.sleep(10)
    return out

def main(client, verbose=True):
    xs = np.ones(4000)*u.arcsec

    outputs = []
    for x in xs:
        output = add(x)
        outputs.append(output)

    results = client.persist(outputs)
    if verbose:
        print("Doing work...")
    progress(results)

    if verbose:
        print('Done!')

def cli():
    # Leaving at default, but I configure it for my cluster in my scripts
    cluster = SLURMCluster()
    # Scale adaptively between 0 and 25 workers
    cluster.adapt(minimum=0, maximum=25)
    
    client = Client(cluster)
    
    main(client)

if __name__ == "__main__":
    cli()

System Details

Linux-4.4.180-94.130-default-x86_64-with-SuSE-12-x86_64
Python 3.7.6 (default, Jan  8 2020, 19:59:22) 
[GCC 7.3.0]
Numpy 1.18.1
astropy 4.2
Scipy 1.4.1
Matplotlib 3.1.3

pllim added the units label Feb 10, 2021
@pllim
Member

pllim commented Feb 10, 2021

@AlecThomson, since your example does not actually reproduce the error you are reporting, can you at least share the full traceback? Thanks!

@mhvk
Contributor

mhvk commented Feb 10, 2021

Also helpful would be the astropy version. It could be related to race conditions in defining units, which we only recently solved... If possible, do try with latest master.
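
A minimal sketch of how the versions asked for here could be collected. The helper name `report_versions` is hypothetical and not part of astropy or dask; it only reads the standard `__version__` attributes.

```python
import platform

import astropy
import numpy


def report_versions():
    """Return the versions seen by whichever process calls this function."""
    return {
        "python": platform.python_version(),
        "numpy": numpy.__version__,
        "astropy": astropy.__version__,
    }


versions = report_versions()
print(versions)
```

With a connected `distributed.Client`, calling `client.run(report_versions)` should execute the function on every worker and return the results keyed by worker address, which would show whether the Slurm workers import a different astropy than the client does. Whether a version mismatch is involved in this issue is not established.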

@AlecThomson
Author

@pllim in the output logs, it only prints the error I quoted above, not the full traceback. However, I also just found that this (pretty dumb) change effectively 'fixes' the issue:

def add(x):
    x = x.value*u.arcsec
    y = 1*u.deg
    out = x + y
    # Mimic some other work
    time.sleep(10)
    return out

@mhvk that sounds like the kind of issue I was worried about. Is that available via conda? Or, will I need to install from the git repo? Thanks!

@adrn
Member

adrn commented Feb 15, 2021

Hi @AlecThomson - you can install the latest development version either by cloning the repo and installing locally, e.g.:

git clone https://github.com/astropy/astropy.git
cd astropy
pip install -e .

Or you can try installing with pip from github directly:

pip install git+https://github.com/astropy/astropy

@AlecThomson
Author

Thanks @adrn. I just tested with astropy 4.3.dev595+gced64b965, and I still got the same error.

As a clarification, my worker function is actually more like

@delayed
def add(x):
    y = 1*u.deg
    out = x - y
    # Mimic some other work
    time.sleep(10)
    return out

This raises the error:

Exception: UnitConversionError("Can only apply 'subtract' function to quantities with compatible dimensions")

The error in my original post occurs with something like:

@delayed
def add(x):
    y = 1*u.deg
    x = x.to(u.deg)
    out = x - y
    # Mimic some other work
    time.sleep(10)
    return out

But I think the underlying issue is the same in both cases.

@pllim
Member

pllim commented Feb 17, 2021

Sounds fishy. Did you try printing out what x actually is before it crashed?
