This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Inconsistent table in pbta-gene-counts-rsem-expected_count.stranded.rds #369

Closed
jashapiro opened this issue Dec 23, 2019 · 5 comments

@jashapiro
Member

jashapiro commented Dec 23, 2019

What data file(s) does this issue pertain to?

pbta-gene-counts-rsem-expected_count.stranded.rds
less so:
pbta-gene-expression-rsem-tpm.stranded.rds

What release are you using?

v12

Put your question or report your issue here.

The count table in pbta-gene-counts-rsem-expected_count.stranded.rds has an extra column at the start of the data table, labeled X, which contains only 0-indexed row numbers. This column does not appear in pbta-gene-counts-rsem-expected_count.polyA.rds and should presumably be removed in future versions.

Additionally, the gene_id column in pbta-gene-counts-rsem-expected_count.stranded.rds and pbta-gene-expression-rsem-tpm.stranded.rds is stored as a factor rather than as character, in contrast to all other files.
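
For anyone working with v12 in the meantime, the two problems described above could be worked around along these lines. This is only a minimal sketch on a toy data frame: `clean_rsem_table` is a hypothetical helper, not part of the project's code, and the sample columns are an illustrative subset.

```r
# Toy stand-in for the stranded count table: an extra 0-indexed `X` column
# and a factor `gene_id` (illustrative subset, not the real file).
counts <- data.frame(
  X = 0:2,
  gene_id = factor(c("ENSG00000000003.14_TSPAN6",
                     "ENSG00000000005.5_TNMD",
                     "ENSG00000000419.12_DPM1")),
  BS_014EVM2D = c(2273, 3, 674),
  BS_02NZT8CE = c(1531, 10, 555)
)

# Hypothetical helper (an assumption for illustration only).
clean_rsem_table <- function(df) {
  # Drop the row-number column if it is present; keep all other columns in order.
  df <- df[, setdiff(colnames(df), "X"), drop = FALSE]
  # Store gene identifiers as character rather than factor.
  df$gene_id <- as.character(df$gene_id)
  df
}

cleaned <- clean_rsem_table(counts)
sapply(cleaned, class)
```

On the real file, the same helper would presumably sit between a `readRDS()` call on the downloaded .rds and a `saveRDS()` call for the cleaned copy.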

@jashapiro jashapiro added the data label Dec 23, 2019
@jharenza
Collaborator

cc-ing @tkoganti, who can fix this in V13

@jharenza jharenza mentioned this issue Dec 23, 2019
7 tasks
@tkoganti
Collaborator

tkoganti commented Jan 6, 2020

Fixed the pbta-gene-counts-rsem-expected_count.stranded.rds file and uploaded it to V13-data.

@tkoganti
Collaborator

tkoganti commented Jan 9, 2020

Both files are uploaded under V13-data here: https://cavatica.sbgenomics.com/u/cavatica/pbta/files/#q?path=processed-data-merge%2FV13-data

@tkoganti
Collaborator

tkoganti commented Jan 14, 2020

@jharenza I checked that every sample in both files is present in the pbta-histologies file.
pbta-gene-expression-rsem-tpm.stranded.rds was fine.
pbta-gene-counts-rsem-expected_count.stranded.rds had one extra sample, BS_4X8PQ5G6, which I deleted. The updated file is uploaded here: https://cavatica.sbgenomics.com/u/cavatica/pbta/files/5e1d165fe4b09d9aaf49571b/
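
A sample check like the one described above could be sketched as follows. This is not necessarily how the check was actually done: the data frames are toy stand-ins with made-up values, and the histologies column name `Kids_First_Biospecimen_ID` is an assumption.

```r
# Toy stand-ins; the real inputs would be the .rds table and the
# pbta-histologies file. Count values here are made up.
counts <- data.frame(
  gene_id = c("ENSG00000000003.14_TSPAN6", "ENSG00000000005.5_TNMD"),
  BS_014EVM2D = c(2273, 3),
  BS_4X8PQ5G6 = c(10, 20),
  stringsAsFactors = FALSE
)
histologies <- data.frame(
  # Assumed name of the biospecimen ID column.
  Kids_First_Biospecimen_ID = c("BS_014EVM2D", "BS_02NZT8CE"),
  stringsAsFactors = FALSE
)

# Every column except gene_id is a sample.
sample_ids <- setdiff(colnames(counts), "gene_id")

# Samples in the table that have no entry in the histologies file.
extra_samples <- setdiff(sample_ids, histologies$Kids_First_Biospecimen_ID)
extra_samples

# Drop the unmatched samples, keeping gene_id and the remaining columns in order.
filtered <- counts[, setdiff(colnames(counts), extra_samples), drop = FALSE]
```

Using `setdiff()` on column names keeps the check independent of column order, so it would also flag an unmatched sample anywhere in the table.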

@jaclyn-taroni
Member

Looking at pbta-gene-counts-rsem-expected_count.stranded.rds in v13:

> `pbta-gene-counts-rsem-expected_count.stranded`[1:5, 1:5]
                      gene_id BS_014EVM2D BS_02NZT8CE BS_03FT4S8B BS_0448A413
1   ENSG00000000003.14_TSPAN6        2273        1531        5005        1317
2      ENSG00000000005.5_TNMD           3          10          10          12
3     ENSG00000000419.12_DPM1         674         555        2282         569
4    ENSG00000000457.13_SCYL3         771         758        1962         534
5 ENSG00000000460.16_C1orf112         360         522         677         355


> head(sapply(`pbta-gene-counts-rsem-expected_count.stranded`, class))
    gene_id BS_014EVM2D BS_02NZT8CE BS_03FT4S8B BS_0448A413 BS_044XZ8ST 
"character"   "numeric"   "numeric"   "numeric"   "numeric"   "numeric" 

and pbta-gene-expression-rsem-tpm.stranded.rds in v13:

> head(sapply(`pbta-gene-expression-rsem-tpm.stranded`, class))
    gene_id BS_014EVM2D BS_02NZT8CE BS_03FT4S8B BS_0448A413 BS_044XZ8ST 
"character"   "numeric"   "numeric"   "numeric"   "numeric"   "numeric" 

Closing this issue.
