Count how many reads on each chromosome fall in the range of each k-mer #1249

Fei-Guang opened this Issue Nov 8, 2016 · 1 comment



Fei-Guang commented Nov 8, 2016

Hello, Team

I need help counting how many reads on each chromosome fall within the range of each k-mer window.

// assumed ADAM import that provides sc.loadAlignments
import org.bdgenomics.adam.rdd.ADAMContext._

val reads = sc.loadAlignments("/data/sample.rmdup.bam")

// (contig, start) for every mapped read
val rdd1 = reads.rdd.flatMap(read => {
  // check whether the read is mapped, lest we get a null pointer exception
  if (read.getReadMapped) {
    Some((read.getContigName, read.getStart))
  } else {
    None
  }
})

// (contig, (start, end)) for every k-mer window
val rdd2 = sc.textFile("/data/win_100k.use_50mer")
  .map(line => {
    // get the range from the k-mer window file
    val columns = line.split("\\s+") // I assume this is tab-delimited?
    val contig = columns(0)
    val start = columns(4).toLong
    val end = columns.last.toLong
    (contig, (start, end))
  })



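How can I get the following count?

For each range in rdd2: if a read in rdd1 is on the same contig (rdd1._1 == rdd2._1) and its start position lies within the range (rdd2._2._1 <= rdd1._2 <= rdd2._2._2), then add 1 to the count for that range, e.g. count[(chr1, (10001, 20000))] += 1.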

Example result:

chr1,(10001,20000), 4
chr1,(30001,40000), 1
chr2,(110001,260000), 2
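
One possible way to compute these counts is sketched below, under some assumptions: read positions are available as an RDD of (contig, start) pairs and ranges as an RDD of (contig, (start, end)) pairs with an inclusive end, as in the snippets above. The object name `RangeCounts`, the method `countReadsPerRange`, and the sample data are hypothetical, invented for illustration. The sketch joins the two RDDs on contig, keeps the pairs whose position lies inside the range, and sums per range. Ranges that contain no reads do not appear in the output, and the join pairs every read with every range on the same contig, so it may be slow for large inputs.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RangeCounts {
  // (contig, (start, end)); end is treated as inclusive (an assumption)
  type Range = (String, (Long, Long))

  // Join on contig, keep positions inside the range, then count per range.
  def countReadsPerRange(
      positions: RDD[(String, Long)],
      ranges: RDD[Range]): RDD[(Range, Long)] = {
    positions.join(ranges)
      .filter { case (_, (pos, (start, end))) => pos >= start && pos <= end }
      .map { case (contig, (_, range)) => ((contig, range), 1L) }
      .reduceByKey(_ + _)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("range-counts").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up read start positions, standing in for rdd1
      val positions = sc.parallelize(Seq(
        ("chr1", 10005L), ("chr1", 12000L), ("chr1", 15000L), ("chr1", 19999L),
        ("chr1", 35000L), ("chr2", 120000L), ("chr2", 250000L)))
      // Made-up ranges, standing in for rdd2
      val ranges = sc.parallelize(Seq(
        ("chr1", (10001L, 20000L)),
        ("chr1", (30001L, 40000L)),
        ("chr2", (110001L, 260000L))))

      countReadsPerRange(positions, ranges)
        .collect()
        .sortBy(_._1)
        .foreach { case ((contig, range), n) => println(s"$contig,$range, $n") }
    } finally {
      sc.stop()
    }
  }
}
```

To apply this to the RDDs above, rdd1 would first need its start values converted to Scala Long (for example with `.map { case (c, s) => (c, s.toLong) }`), assuming `getStart` returns a boxed `java.lang.Long`. It is also worth checking whether the read coordinates and the window file use the same convention (0-based versus 1-based), since an off-by-one there would shift reads across range boundaries.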

Fei-Guang closed this Nov 8, 2016
