Run tests against numpy 2 #256

Open
jeromekelleher opened this issue Jun 19, 2024 · 4 comments

@jeromekelleher (Contributor)

Some errors when running the tests against numpy 2:

tests/test_vcf_examples.py:846:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
bio2zarr/vcf2zarr/vcz.py:1040: in convert
    icf.explode(
bio2zarr/vcf2zarr/icf.py:1188: in explode
    writer.explode(worker_processes=worker_processes, show_progress=show_progress)
bio2zarr/vcf2zarr/icf.py:1130: in explode
    pwm.submit(self.process_partition, j)
bio2zarr/core.py:291: in submit
    future = self.executor.submit(*args, **kwargs)
bio2zarr/core.py:90: in submit
    future.set_result(fn(*args, **kwargs))
bio2zarr/vcf2zarr/icf.py:1071: in process_partition
    tcw.append("POS", variant.POS)
bio2zarr/vcf2zarr/icf.py:806: in append
    self.field_writers[name].append(value)
bio2zarr/vcf2zarr/icf.py:724: in append
    val = self.transformer.transform_and_update_bounds(val)
bio2zarr/vcf2zarr/icf.py:503: in transform_and_update_bounds
    value = self.transform(vcf_value)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <bio2zarr.vcf2zarr.icf.IntegerValueTransformer object at 0x7f1d07053160>, vcf_value = 111

    def transform(self, vcf_value):
        if isinstance(vcf_value, tuple):
            vcf_value = [self.missing if v is None else v for v in vcf_value]
>       value = np.array(vcf_value, ndmin=self.dimension, copy=False)
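For context, a minimal sketch of the NumPy behaviour change behind this failure (the transformer code itself is quoted above; this snippet is just an illustration): in NumPy 1.x, copy=False meant "avoid a copy if possible", while in NumPy 2 it means "never copy" and raises ValueError when a copy is unavoidable, e.g. when converting a plain Python int like the POS value 111 here.

import numpy as np

value = 111  # a plain Python int, as in vcf_value above

try:
    # NumPy 2: copy=False now means "never copy" and raises if a copy is required
    arr = np.array(value, ndmin=1, copy=False)
except ValueError as err:
    print("NumPy 2 refuses the no-copy conversion:", err)

# copy=None is the NumPy 2 spelling of the old "copy only if needed" behaviour
arr = np.array(value, ndmin=1, copy=None)
print(arr)  # [111]
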
@jeromekelleher (Contributor, Author)

Fixing that problem is easy enough by setting copy=None. However, we get something more sinister later in the process:

(numpy-2-venv) jk@empire$ python3 -m bio2zarr vcf2zarr explode tests/data/vcf/sample.vcf.gz tmp/x.icf -p0
    Scan: 100%|█████████████████████████████████████████████████████████████████| 1.00/1.00 [00:00<00:00, 52.7files/s]
 Explode: 100%|███████████████████████████████████████████████████████████████████| 9.00/9.00 [00:00<00:00, 429vars/s]
(numpy-2-venv) jk@empire$ python3 -m bio2zarr vcf2zarr encode tmp/x.icf tmp/x.vcz -f -p0
  Encode:  85%|███████████████████████████████████████████████████████████▊          | 792/927 [00:00<00:00, 7.81kB/s]Segmentation fault (core dumped)
(numpy-2-venv) jk@empire$ /usr/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

So, during the encode step we've got a segfault. Digging into this now.

@jeromekelleher (Contributor, Author)

Hmm, setting copy=True above resolves this segfault (which appears to happen when accessing the genotype data). So copy=True seems fine and dandy to me - I'm sure the performance difference is negligible.
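For reference, a hedged sketch of what that change looks like, modelled on the transform method quoted in the traceback above. The defaults for missing and dimension here are hypothetical stand-ins for the instance attributes, not values taken from the repo:

import numpy as np

# Hypothetical stand-in for IntegerValueTransformer.transform, showing only the
# copy-semantics change discussed above.
def transform(vcf_value, missing=-1, dimension=1):
    if isinstance(vcf_value, tuple):
        vcf_value = [missing if v is None else v for v in vcf_value]
    # copy=True behaves identically on NumPy 1.x and 2.x, and guarantees the
    # returned array does not alias the VCF parser's internal buffers.
    return np.array(vcf_value, ndmin=dimension, copy=True)

print(transform(111))           # [111]
print(transform((0, None, 2)))  # [ 0 -1  2]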

@jeromekelleher (Contributor, Author)

With #257 we should be basically set for numpy 2.0 and numpy 1.x compatibility. To close this issue we should add a CI job that explicitly installs numpy 2.x and runs the tests. Later, when numpy 2.x becomes the default version we install (due to dependencies), we can switch this job to 1.x.

We're waiting on numpy 2.0-compatible wheels from msprime, so there's no point in adding this CI job until they arrive.
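When the wheels land, a small sanity check like the following (a hypothetical helper, not currently in the repo) could run at the start of the numpy-2 CI job so the test run fails loudly if dependency resolution quietly pulls in numpy 1.x:

import numpy as np

# Fail fast if the "numpy 2" CI job did not actually end up with NumPy 2.x,
# e.g. because some dependency pinned numpy<2 during resolution.
major = int(np.__version__.split(".")[0])
assert major >= 2, f"expected NumPy 2.x in this job, got {np.__version__}"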

@jeromekelleher (Contributor, Author)

Just waiting on the numpy 2.0 wheels for msprime, which should arrive in a few days; we can then ship a numpy 2.0-compatible version.
