Can CLR be applied after normalizing the pfam abundances with RPKG (reads per kilobase of genome)? #44

@alvsanchezmaria

I have a question that I have not been able to resolve, however much of the literature I read. It concerns the normalization and transformation of abundance data. I have a table of abundances of genes annotated with Pfam for 30 samples, which I have normalized by RPKG (reads per kilobase of genome). I included the metadata table and ran FlashWeave with the default values, so it applies CLR normalization:

network = learn_network(abundance_table, metadata_table, sensitive=true, heterogeneous=false, transposed=true, n_obs_min=threshold)

Is it wrong to normalize twice? When we compute RPKG, are we both normalizing and resolving the compositionality problem, or is it still necessary to apply CLR? As I understand it, RPKG serves for between-sample normalization, whereas CLR is a transformation that addresses the compositionality of the data. Can CLR be applied automatically in FlashWeave after normalizing the Pfam abundances with RPKG? Or is it better to run FlashWeave with normalize=false?

network = learn_network(abundance_table, metadata_table, sensitive=true, heterogeneous=false, transposed=true, n_obs_min=threshold, normalize=false)
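To make the question concrete, here is a minimal sketch in plain Python (not FlashWeave code) of what a textbook CLR does to RPKG-style values. All numbers, and the names `counts`, `gene_kb` and `genome_equiv`, are hypothetical, and zeros are left out for simplicity; FlashWeave's internal normalization may handle zeros differently, so this only illustrates the plain transform.

```python
import math

def clr(sample):
    """Plain centered log-ratio of one sample (all values must be > 0)."""
    logs = [math.log(v) for v in sample]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]

# Hypothetical read counts: 2 samples x 3 Pfam families (no zeros).
counts = [[120.0, 30.0, 600.0],
          [40.0, 90.0, 200.0]]
gene_kb = [1.5, 0.6, 3.0]      # hypothetical family lengths in kb (per feature)
genome_equiv = [50.0, 20.0]    # hypothetical genome equivalents (per sample)

# RPKG-style scaling: divide by feature length and by a per-sample factor.
rpkg = [[c / kb / ge for c, kb in zip(row, gene_kb)]
        for row, ge in zip(counts, genome_equiv)]

clr_raw = [clr(row) for row in counts]
clr_rpkg = [clr(row) for row in rpkg]

# Feature-by-feature difference between the two CLR tables.
offsets = [[a - b for a, b in zip(r1, r2)]
           for r1, r2 in zip(clr_raw, clr_rpkg)]

for row in offsets:
    print([round(v, 6) for v in row])
```

Under these assumptions the per-sample factor cancels inside the CLR, and the length correction only shifts each feature by the same constant in every sample (the CLR of `gene_kb`). With real data containing zeros and pseudo-counts this cancellation is no longer exact.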

I have computed both networks and am comparing them. The general metrics do not change much, but the interpretation does change when I analyze the communities and the roles of the nodes (classifying them as connectors, peripherals, module hubs and network hubs). Any ideas about why?
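For reference, the role classification mentioned above can be sketched as follows. This is a hedged illustration, not the exact procedure used here: it assumes roles come from the within-module degree z-score and the participation coefficient, with cutoffs of 2.5 and 0.62 that are often used for this kind of classification. The function `node_roles`, the toy graph and the module labels are all hypothetical.

```python
import math
from collections import defaultdict

def node_roles(edges, module_of, z_cut=2.5, p_cut=0.62):
    """Return {node: (z, p, role)} for an undirected graph.

    z: within-module degree z-score; p: participation coefficient.
    The cutoffs are assumed defaults, not values taken from this analysis.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Number of links each node has inside its own module.
    k_in = {n: sum(1 for m in neighbors[n] if module_of[m] == module_of[n])
            for n in module_of}
    by_module = defaultdict(list)
    for n, mod in module_of.items():
        by_module[mod].append(k_in[n])

    roles = {}
    for n, mod in module_of.items():
        vals = by_module[mod]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((x - mean) ** 2 for x in vals) / len(vals))
        z = (k_in[n] - mean) / sd if sd > 0 else 0.0

        # Participation: how evenly the node's links spread over modules.
        k = len(neighbors[n])
        links = defaultdict(int)
        for m in neighbors[n]:
            links[module_of[m]] += 1
        p = 1.0 - sum((c / k) ** 2 for c in links.values()) if k > 0 else 0.0

        hub, spread = z > z_cut, p > p_cut
        if hub and spread:
            role = "network hub"
        elif hub:
            role = "module hub"
        elif spread:
            role = "connector"
        else:
            role = "peripheral"
        roles[n] = (z, p, role)
    return roles

# Hypothetical toy network with three modules.
module_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}
edges = [("a1", "a2"), ("a1", "b1"), ("a1", "c1"), ("b1", "b2"), ("c1", "c2")]
roles = node_roles(edges, module_of)
for node in sorted(roles):
    print(node, roles[node])
```

Both z and p are computed relative to the module partition, so a node close to a cutoff can change role when a few edges or the detected communities differ between two networks, even if global metrics stay similar.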

Thanks in advance, Maria
