## Test case LVV-T2724: 

### Specification: 
This test covers two requirements relating to high-volume queries on Qserv.

#### Requirement: DMS-REQ-0357  (Priority: 1b): Result latency for high-volume full-sky queries on the Object table <br>
Specification: High-volume queries on the Object table -- queries that involve full-sky scans -- shall be answered in hvObjectQueryTime.
hvObjectQueryTime: Maximum time allowed for retrieving results of a high-volume query of the Object table == 60 minutes

#### Requirement: DMS-REQ-0361 (Priority: 1b): Simultaneous users for high-volume queries
Specification: The system shall support hvQueryUsers simultaneous high-volume queries running at any given time
hvQueryUsers: Minimum number of simultaneous users performing high volume queries == 50 
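
To make the "simultaneous" part of DMS-REQ-0361 concrete, here is a minimal sketch of timing several concurrent callers with the standard library. It is not how Mobu works; `time_concurrent` and `fake_query` are hypothetical names, and `fake_query` is a stand-in that only sleeps instead of submitting a TAP job.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_concurrent(run_query, n_users):
    """Call run_query n_users times in parallel threads.

    Returns the wall-clock duration of each call, in seconds.
    """
    def timed_call(_):
        start = time.monotonic()
        run_query()
        return time.monotonic() - start

    with ThreadPoolExecutor(max_workers=n_users) as pool:
        return list(pool.map(timed_call, range(n_users)))


def fake_query():
    # Hypothetical placeholder for a real high-volume query
    time.sleep(0.01)


durations = time_concurrent(fake_query, n_users=5)
print(f"{len(durations)} calls, slowest took {max(durations):.3f} s")
```

In a real run, the slowest of the hvQueryUsers durations would be the value to compare against hvObjectQueryTime.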

### Discussion:
This is a query on Qserv, not the Butler.
A full scan of DP1 is the best we can do as of 2025-07-22.
Eventually, verify latency against a full-scale (e.g. DR1-sized) Object catalog.

From DMS-REQ-0356 (Priority: 1b): a high-volume query should return a result set > 0.5 GB.
Note: queryResult table result_4430668 is too large at 3203320192 bytes; max allowed size is 3145728000 bytes.
Mobu will simulate the hvQueryUsers. The script will compare the query time to the spec.
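
The two byte counts in the note above are easier to compare in larger units. The sketch below uses only those two numbers; the variable names are illustrative, and the choice of MiB (1024 ** 2 bytes) is an assumption made for readability.

```python
# Byte counts copied from the note above
result_bytes = 3203320192
max_bytes = 3145728000

MIB = 1024 ** 2  # bytes per mebibyte

excess_bytes = result_bytes - max_bytes

print(f"result size: {result_bytes / MIB:.2f} MiB")
print(f"size limit:  {max_bytes / MIB:.2f} MiB")
print(f"over by:     {excess_bytes / MIB:.2f} MiB")
```

The limit works out to exactly 3000 MiB, so a high-volume test query has to return more than 0.5 GB while staying below that cap.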


In [1]:
from lsst.rsp import get_tap_service
import time

In [2]:
# Requirements specs

# Maximum time allowed for retrieving results of a high-volume query of the Object table, in hours (60 minutes)
hvObjectQueryTime = 1

# Minimum number of simultaneous users performing high volume queries. (mobu)
hvQueryUsers = 50 

In [3]:
service = get_tap_service("tap")
assert service is not None

In [4]:
# Calculate the in-memory size of a DataFrame in GB (1 GB taken as 1024**3 bytes)
def get_size_in_gb(df):
    size = df.memory_usage(deep=True).sum()
    size /= 1024 ** 3
    return size

In [5]:
# High-volume query 
query = """SELECT *, 
scisql_nanojanskyToAbMag(r_cModelFlux) as rmag 
FROM dp1.Object
WHERE scisql_nanojanskyToAbMag(r_cModelFlux) < 24
"""
print(query)

SELECT *, 
scisql_nanojanskyToAbMag(r_cModelFlux) as rmag 
FROM dp1.Object
WHERE scisql_nanojanskyToAbMag(r_cModelFlux) < 24



In [6]:
job = service.submit_job(query)

In [None]:
# Benchmark the part from job run start to successfully returning the result set
start_time = time.time()
job.run()

job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)

if job.phase == 'ERROR':
    job.raise_if_error()
assert job.phase == 'COMPLETED'

results = job.fetch_result()
end_time = time.time()

In [None]:
# Run time in seconds (difference of two time.time() values)
execution_time = end_time - start_time
print(f"Execution time: {execution_time:.1f} s")

In [None]:
# Result set size
df = results.to_table().to_pandas()
size = get_size_in_gb(df)
print(f"DataFrame size: {size:.2f} GB")
print(df.info(memory_usage='deep'))

In [None]:
job.delete()
del query, results

In [None]:
# Check we are within spec (execution_time is in seconds, hvObjectQueryTime in hours)
# assert execution_time <= hvObjectQueryTime * 3600
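
The spec check mixes units: differences of `time.time()` values are in seconds, while hvObjectQueryTime is stated in hours. The sketch below shows one way to check both the latency limit and the 0.5 GB high-volume threshold from DMS-REQ-0356. `check_hv_query` is a hypothetical helper, and the numbers passed to it are illustrative, not measured results.

```python
SECONDS_PER_HOUR = 3600


def check_hv_query(execution_time_s, result_size_gb,
                   max_hours=1.0, min_result_gb=0.5):
    """Return (time_ok, size_ok) for a single high-volume query run.

    time_ok: the query finished within max_hours.
    size_ok: the result set is larger than min_result_gb.
    """
    time_ok = execution_time_s <= max_hours * SECONDS_PER_HOUR
    size_ok = result_size_gb > min_result_gb
    return time_ok, size_ok


# Illustrative values only
time_ok, size_ok = check_hv_query(execution_time_s=1800.0, result_size_gb=2.5)
print(f"within hvObjectQueryTime: {time_ok}, high-volume result: {size_ok}")
```

In this notebook the arguments would be `execution_time` and the value returned by `get_size_in_gb(df)`, with `max_hours=hvObjectQueryTime`.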