-
Notifications
You must be signed in to change notification settings - Fork 77
Open
Labels
Python APIIssue is about the Python APIIssue is about the Python APIenhancementNew feature or requestNew feature or request
Description
I frequently use msprime.RateMap
and often need to write these out to hapmap format, e.g. for input to a standalone tool. I noticed that tskit-dev/msprime#1338 discusses adding RateMap.write_hapmap
, but afaik this isn't in the works. Is there still interest in including a method along these lines? Something similar to,
def write_hapmap(ratemap: msprime.RateMap, contig_name: str) -> str:
"""
Write a recombination rate map into hapmap format.
"""
physical_position = ratemap.position.astype(np.int64)
scaled_rate = ratemap.rate * 1e8
map_position = ratemap.get_cumulative_mass(physical_position) * 100
hapmap = ["Chromosome\tPosition(bp)\tRate(cM/Mb)\tMap(cM)"]
if np.isnan(scaled_rate[-1]): # handle trailing NaN
scaled_rate[-1] = 0.0
else:
scaled_rate = np.append(scaled_rate, 0.0)
for rate, pos, map in zip(scaled_rate, physical_position, map_position):
if not np.isnan(rate):
hapmap.append(f"{contig_name}\t{pos}\t{rate:.10f}\t{map:.10f}")
hapmap = "\n".join(hapmap) + "\n"
return hapmap
# for tests
def assert_hapmap_equals_ratemap(ratemap: msprime.RateMap, hapmap_path: str, atol: float = 1e-12):
"""
Check that hapmap file matches rate map.
"""
hapmap_check = msprime.RateMap.read_hapmap(hapmap_path, sequence_length=ratemap.sequence_length, rate_col=2)
np.testing.assert_allclose(hapmap_check.rate, ratemap.rate, atol=atol)
np.testing.assert_allclose(hapmap_check.position, ratemap.position)
hapmap_check = msprime.RateMap.read_hapmap(hapmap_path, sequence_length=ratemap.sequence_length, map_col=3)
np.testing.assert_allclose(hapmap_check.rate, ratemap.rate, atol=atol)
np.testing.assert_allclose(hapmap_check.position, ratemap.position)
But perhaps with arguments controlling scaling of rates/map positions; precision of text output; and header -- so that non-recombination maps could be written out in meaningful units?
Metadata
Metadata
Assignees
Labels
Python APIIssue is about the Python APIIssue is about the Python APIenhancementNew feature or requestNew feature or request