Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate Access of Pixel Data #4

Open
acbecker opened this issue Dec 9, 2014 · 7 comments
Open

Investigate Access of Pixel Data #4

acbecker opened this issue Dec 9, 2014 · 7 comments

Comments

@acbecker
Copy link
Contributor

acbecker commented Dec 9, 2014

There are a couple of options to accessing the actual pixel data:

  • Leave the images on disk and access them outside the database
  • Ingest the images as BLOBS
  • Ingest each pixel as a row in the database

The latter issue might be problematic as there are 10^12 pixels and ingest (and index building and clustering) could take a substantial amount of time (CPU years). A parallel database might help solve this.

I will start this investigation by looking at how to turn the ra,decl of a Trajectory/Image intersection into a pixel address within the image. This can happen

  • In the database using our UDF
  • Outside the database using the WCS transformation that comes with the .fits file
@acbecker
Copy link
Contributor Author

acbecker commented Dec 9, 2014

@dhalperi Here is a query that builds off of our previous work intersecting trajectories with runs and then images. It uses the UDF we put together earlier this quarter.


WITH TI AS (
    WITH TR AS (
        SELECT
        traj.trajId as trajId, 
        traj.ra0 as ra0,
        traj.dec0 as dec0,
        traj.t0 as t0,
        traj.delta_ra as delta_ra,
        traj.delta_dec as delta_dec,
        ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmin)), 
                    ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmax))) as tline
        FROM 
            Trajectory as traj,
            Run as run
        WHERE
            run.bbox && 
            ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec),
                        ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec))
        AND
            ABS(EXTRACT(EPOCH FROM run.tmax-traj.t0)) < 2592000
    )
    SELECT 
        im.imageId, TR.trajId,
        (TR.ra0  + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_ra) as raInt, 
        (TR.dec0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_dec) as decInt
    FROM
        Image as im,
        TR
    WHERE
        im.bbox3d &&& TR.tline
)
SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt,
    c_skyToIdx(w, TI.raInt, TI.decInt)
FROM
    Wcs as w,
    TI
WHERE
    w.imageId = TI.imageId
;

Will check timings next.

@dhalperi
Copy link
Member

I'm pretty sure ingesting pixels will never be worth it. So if we go the
udf route we want the udf to take in ra, dec, and the image and give us the
confidence directly.


Daniel Halperin
Director of Research for Scalable Data Analytics
eScience Institute
University of Washington

On Tue, Dec 9, 2014 at 3:06 PM, Andy Becker notifications@github.com
wrote:

@dhalperi https://github.com/dhalperi Here is a query that builds off
of our previous work intersecting trajectories with runs and then images.
It uses the UDF we put together earlier this quarter.

WITH TI AS (
WITH TR AS (
SELECT
traj.trajId as trajId,
traj.ra0 as ra0,
traj.dec0 as dec0,
traj.t0 as t0,
traj.delta_ra as delta_ra,
traj.delta_dec as delta_dec,
ST_MakeLine(ST_MakePoint(traj.ra0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra,
traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec,
EXTRACT(EPOCH FROM run.tmin)),
ST_MakePoint(traj.ra0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra,
traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec,
EXTRACT(EPOCH FROM run.tmax))) as tline
FROM
Trajectory as traj,
Run as run
WHERE
run.bbox &&
ST_MakeLine(ST_MakePoint(traj.ra0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra,
traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec),
ST_MakePoint(traj.ra0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra,
traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec))
AND
ABS(EXTRACT(EPOCH FROM run.tmax-traj.t0)) < 2592000
)
SELECT
im.imageId, TR.trajId,
(TR.ra0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_ra) as raInt,
(TR.dec0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_dec) as decInt
FROM
Image as im,
TR
WHERE
im.bbox3d &&& TR.tline
)
SELECT
TI.imageId, TI.trajId, TI.raInt, TI.decInt,
c_skyToIdx(w, TI.raInt, TI.decInt)
FROM
Wcs as w,
TI
WHERE
w.imageId = TI.imageId
;

Will check timings next.


Reply to this email directly or view it on GitHub
#4 (comment).

@acbecker
Copy link
Contributor Author

Getting strange results on these timings. Here are a couple versions to see where the time is being taken up:

  • Column 1: Full query

    
    ...
    SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt,
    c_skyToIdx(w, TI.raInt, TI.decInt)
    FROM
    Wcs as w,
    TI
    WHERE
    w.imageId = TI.imageId
    
  • Column 2: Just a join with the Wcs table

    
    ...
    SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt, w.wcsId
    FROM
    Wcs as w,
    TI
    WHERE
    w.imageId = TI.imageId
    
  • Column 3: Just making the 2 temporary tables instead of 1 previously

    
    ...
    SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt
    FROM
    TI
    

1   8.455 +/- 1.179    1.567 +/- 0.067   0.053 +/- 0.033
5   11.691 +/- 1.274   1.638 +/- 0.049   0.155 +/- 0.075
10  12.752 +/- 0.400   1.666 +/- 0.026   0.188 +/- 0.060
50  12.263 +/- 0.295   1.813 +/- 0.080   0.293 +/- 0.043
100 12.771 +/- 0.642   1.966 +/- 0.045   0.473 +/- 0.056
500 21.576 +/- 1.868   3.664 +/- 0.144   2.113 +/- 0.076

@dhalperi
Copy link
Member

can you post the explains?


Daniel Halperin
Director of Research for Scalable Data Analytics
eScience Institute
University of Washington

On Wed, Dec 10, 2014 at 10:28 AM, Andy Becker notifications@github.com
wrote:

Getting strange results on these timings. Here are a couple versions to
see where the time is being taken up:

Column 1: Full query

...
SELECT
TI.imageId, TI.trajId, TI.raInt, TI.decInt,
c_skyToIdx(w, TI.raInt, TI.decInt)
FROM
Wcs as w,
TI
WHERE
w.imageId = TI.imageId

-

Column 2: Just a join with the Wcs table

...
SELECT
TI.imageId, TI.trajId, TI.raInt, TI.decInt, w.wcsId
FROM
Wcs as w,
TI
WHERE
w.imageId = TI.imageId

-

Column 3: Just making the 2 temporary tables instead of 1 previously

...
SELECT
TI.imageId, TI.trajId, TI.raInt, TI.decInt
FROM
TI

1 8.455 +/- 1.179 1.567 +/- 0.067 0.053 +/- 0.033
5 11.691 +/- 1.274 1.638 +/- 0.049 0.155 +/- 0.075
10 12.752 +/- 0.400 1.666 +/- 0.026 0.188 +/- 0.060
50 12.263 +/- 0.295 1.813 +/- 0.080 0.293 +/- 0.043
100 12.771 +/- 0.642 1.966 +/- 0.045 0.473 +/- 0.056
500 21.576 +/- 1.868 3.664 +/- 0.144 2.113 +/- 0.076


Reply to this email directly or view it on GitHub
#4 (comment).

@acbecker
Copy link
Contributor Author

Q1:


EXPLAIN ANALYZE
WITH TI AS (
    WITH TR AS (
        SELECT
        traj.trajId as trajId, 
        traj.ra0 as ra0,
        traj.dec0 as dec0,
        traj.t0 as t0,
        traj.delta_ra as delta_ra,
        traj.delta_dec as delta_dec,
        ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmin)), 
                    ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmax))) as tline
        FROM 
            Trajectory as traj,
            Run as run
        WHERE
            run.bbox && 
            ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec),
                        ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec))
        AND
            ABS(EXTRACT(EPOCH FROM run.tmax-traj.t0)) < 2592000
    )
    SELECT 
        im.imageId, TR.trajId,
        (TR.ra0  + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_ra) as raInt, 
        (TR.dec0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_dec) as decInt
    FROM
        Image as im,
        TR
    WHERE
        im.bbox3d &&& TR.tline
)
SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt,
    c_skyToIdx(w, TI.raInt, TI.decInt)
FROM
    Wcs as w,
    TI
WHERE
    w.imageId = TI.imageId
;
                                                                                                                                       
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=50539.21..402698.30 rows=102724 width=468) (actual time=88.530..11818.871 rows=16893 loops=1)
   CTE ti
     ->  Nested Loop  (cost=10835.85..50538.78 rows=102724 width=64) (actual time=20.446..1803.417 rows=16893 loops=1)
           CTE tr
             ->  Nested Loop  (cost=0.00..10825.85 rows=50 width=64) (actual time=20.261..392.633 rows=8831 loops=1)
                   Join Filter: ((abs(date_part('epoch'::text, (run.tmax - traj.t0))) < 2592000::double precision) AND (run.bbox && st_makeline(st_makepoint((traj.ra0 +
 (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_dec))), st_makepoint((traj
.ra0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_dec))))))
                   Rows Removed by Join Filter: 140169
                   ->  Seq Scan on trajectory traj  (cost=0.00..10.00 rows=500 width=48) (actual time=0.009..0.383 rows=500 loops=1)
                   ->  Materialize  (cost=0.00..11.47 rows=298 width=136) (actual time=0.000..0.113 rows=298 loops=500)
                         ->  Seq Scan on run  (cost=0.00..9.98 rows=298 width=136) (actual time=0.007..0.155 rows=298 loops=1)
           ->  CTE Scan on tr  (cost=0.00..1.00 rows=50 width=80) (actual time=20.263..408.706 rows=8831 loops=1)
           ->  Bitmap Heap Scan on image im  (cost=10.00..751.10 rows=205 width=104) (actual time=0.136..0.145 rows=2 loops=8831)
                 Recheck Cond: (bbox3d &&& tr.tline)
                 ->  Bitmap Index Scan on image_bbox3d_idx  (cost=0.00..9.95 rows=205 width=0) (actual time=0.129..0.129 rows=2 loops=8831)
                       Index Cond: (bbox3d &&& tr.tline)
   ->  CTE Scan on ti  (cost=0.00..2054.48 rows=102724 width=32) (actual time=20.451..1828.328 rows=16893 loops=1)
   ->  Index Scan using wcs_imageid_idx on wcs w  (cost=0.43..3.40 rows=1 width=444) (actual time=0.258..0.261 rows=1 loops=16893)
         Index Cond: (imageid = ti.imageid)
 Total runtime: 11826.657 ms
(19 rows)

@acbecker
Copy link
Contributor Author

Q2:


EXPLAIN ANALYZE
WITH TI AS (
    WITH TR AS (
        SELECT
        traj.trajId as trajId, 
        traj.ra0 as ra0,
        traj.dec0 as dec0,
        traj.t0 as t0,
        traj.delta_ra as delta_ra,
        traj.delta_dec as delta_dec,
        ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmin)), 
                    ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmax))) as tline
        FROM 
            Trajectory as traj,
            Run as run
        WHERE
            run.bbox && 
            ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec),
                        ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec))
        AND
            ABS(EXTRACT(EPOCH FROM run.tmax-traj.t0)) < 2592000
    )
    SELECT 
        im.imageId, TR.trajId,
        (TR.ra0  + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_ra) as raInt, 
        (TR.dec0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_dec) as decInt
    FROM
        Image as im,
        TR
    WHERE
        im.bbox3d &&& TR.tline
)
SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt, w.wcsId
FROM
    Wcs as w,
    TI
WHERE
    w.imageId = TI.imageId
;
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=206511.71..210877.48 rows=102724 width=36) (actual time=2961.697..5019.968 rows=16893 loops=1)
   Hash Cond: (ti.imageid = w.imageid)
   CTE ti
     ->  Nested Loop  (cost=10835.85..50538.78 rows=102724 width=64) (actual time=0.250..1946.971 rows=16893 loops=1)
           CTE tr
             ->  Nested Loop  (cost=0.00..10825.85 rows=50 width=64) (actual time=0.138..341.822 rows=8831 loops=1)
                   Join Filter: ((abs(date_part('epoch'::text, (run.tmax - traj.t0))) < 2592000::double precision) AND (run.bbox && st_makeline(st_makepoint((traj.ra0 +
 (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_dec))), st_makepoint((traj
.ra0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_dec))))))
                   Rows Removed by Join Filter: 140169
                   ->  Seq Scan on trajectory traj  (cost=0.00..10.00 rows=500 width=48) (actual time=0.006..0.282 rows=500 loops=1)
                   ->  Materialize  (cost=0.00..11.47 rows=298 width=136) (actual time=0.000..0.104 rows=298 loops=500)
                         ->  Seq Scan on run  (cost=0.00..9.98 rows=298 width=136) (actual time=0.004..0.110 rows=298 loops=1)
           ->  CTE Scan on tr  (cost=0.00..1.00 rows=50 width=80) (actual time=0.139..379.809 rows=8831 loops=1)
           ->  Bitmap Heap Scan on image im  (cost=10.00..751.10 rows=205 width=104) (actual time=0.166..0.168 rows=2 loops=8831)
                 Recheck Cond: (bbox3d &&& tr.tline)
                 ->  Bitmap Index Scan on image_bbox3d_idx  (cost=0.00..9.95 rows=205 width=0) (actual time=0.160..0.160 rows=2 loops=8831)
                       Index Cond: (bbox3d &&& tr.tline)
   ->  CTE Scan on ti  (cost=0.00..2054.48 rows=102724 width=32) (actual time=0.252..1993.615 rows=16893 loops=1)
   ->  Hash  (cost=130995.19..130995.19 rows=1998219 width=12) (actual time=2960.485..2960.485 rows=1998219 loops=1)
         Buckets: 262144  Batches: 1  Memory Usage: 93667kB
         ->  Seq Scan on wcs w  (cost=0.00..130995.19 rows=1998219 width=12) (actual time=0.084..1546.014 rows=1998219 loops=1)
 Total runtime: 5063.351 ms
(21 rows)

@acbecker
Copy link
Contributor Author

Q3:


EXPLAIN ANALYZE
WITH TI AS (
    WITH TR AS (
        SELECT
        traj.trajId as trajId, 
        traj.ra0 as ra0,
        traj.dec0 as dec0,
        traj.t0 as t0,
        traj.delta_ra as delta_ra,
        traj.delta_dec as delta_dec,
        ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmin)), 
                    ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                 traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec,
                                 EXTRACT(EPOCH FROM run.tmax))) as tline
        FROM 
            Trajectory as traj,
            Run as run
        WHERE
            run.bbox && 
            ST_MakeLine(ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmin-traj.t0) * traj.delta_dec),
                        ST_MakePoint(traj.ra0  + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_ra, 
                                     traj.dec0 + EXTRACT(EPOCH FROM run.tmax-traj.t0) * traj.delta_dec))
        AND
            ABS(EXTRACT(EPOCH FROM run.tmax-traj.t0)) < 2592000
    )
    SELECT 
        im.imageId, TR.trajId,
        (TR.ra0  + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_ra) as raInt, 
        (TR.dec0 + EXTRACT(EPOCH FROM im.tmid-TR.t0) * TR.delta_dec) as decInt
    FROM
        Image as im,
        TR
    WHERE
        im.bbox3d &&& TR.tline
)
SELECT 
    TI.imageId, TI.trajId, TI.raInt, TI.decInt
FROM
    TI
;
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on ti  (cost=50538.78..52593.26 rows=102724 width=32) (actual time=0.254..2013.158 rows=16893 loops=1)
   CTE ti
     ->  Nested Loop  (cost=10835.85..50538.78 rows=102724 width=64) (actual time=0.251..1983.392 rows=16893 loops=1)
           CTE tr
             ->  Nested Loop  (cost=0.00..10825.85 rows=50 width=64) (actual time=0.142..372.300 rows=8831 loops=1)
                   Join Filter: ((abs(date_part('epoch'::text, (run.tmax - traj.t0))) < 2592000::double precision) AND (run.bbox && st_makeline(st_makepoint((traj.ra0 +
 (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmin - traj.t0)) * traj.delta_dec))), st_makepoint((traj
.ra0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_ra)), (traj.dec0 + (date_part('epoch'::text, (run.tmax - traj.t0)) * traj.delta_dec))))))
                   Rows Removed by Join Filter: 140169
                   ->  Seq Scan on trajectory traj  (cost=0.00..10.00 rows=500 width=48) (actual time=0.007..0.563 rows=500 loops=1)
                   ->  Materialize  (cost=0.00..11.47 rows=298 width=136) (actual time=0.000..0.098 rows=298 loops=500)
                         ->  Seq Scan on run  (cost=0.00..9.98 rows=298 width=136) (actual time=0.004..0.096 rows=298 loops=1)
           ->  CTE Scan on tr  (cost=0.00..1.00 rows=50 width=80) (actual time=0.144..392.608 rows=8831 loops=1)
           ->  Bitmap Heap Scan on image im  (cost=10.00..751.10 rows=205 width=104) (actual time=0.169..0.174 rows=2 loops=8831)
                 Recheck Cond: (bbox3d &&& tr.tline)
                 ->  Bitmap Index Scan on image_bbox3d_idx  (cost=0.00..9.95 rows=205 width=0) (actual time=0.161..0.161 rows=2 loops=8831)
                       Index Cond: (bbox3d &&& tr.tline)
 Total runtime: 2017.684 ms
(16 rows)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants