Aidan Coyle

afcoyle@uw.edu

2021-07-01

Roberts Lab, UW-SAFS

In script 7_1_manual_clustering_cbaiv4.0.Rmd, we took libraries aligned to a transcriptome filtered to include only presumed _Chionoecetes bairdi_ genes, grouped them by crab (e.g. all libraries for Crab A, Crab B, Crab C...), and clustered genes into modules based on their expression patterns.

Note that this script covers two more crabs, Crab D and Crab F, than scripts 7_4 and 7_5. Crabs D and F were uninfected, so it only made sense to align them to a _C. bairdi_-only transcriptome.

We then assigned each module one of five expression patterns. For crabs with three time points (the ambient- and lowered-temperature treatment groups), we used the following notation:

- High to low (HTL): Expression decreases over time (regardless of whether the decrease took place on Day 2 or Day 17)

- Low to high (LTH): Expression increases over time (regardless of whether the increase took place on Day 2 or Day 17)

- Low High Low (LHL): Expression increases on Day 2, and then drops on Day 17

- High Low High (HLH): Expression drops on Day 2 and then increases on Day 17

- Mixed (MIX): Expression within the module follows no clear pattern

Crabs in the elevated-temperature treatment group (crabs G, H, and I) had only two time points, so they use a different notation:

- LL = expression stays low

- HH = expression stays high

- LH = expression goes from low to high

- HL = expression goes from high to low

- MIX = mixed - no clear pattern of expression within the module

Importantly, **multiple modules within a single crab could be given the same assignment**. This script resolves that by merging the gene lists of all modules that share an assignment within a crab.

First, let's look at the files for one crab:

In [3]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/

cluster_HLH.txt		  cluster_LHL.txt	    cluster_LTH3_heatmap.png
cluster_HLH_heatmap.png   cluster_LHL_heatmap.png   cluster_LTH_heatmap.png
cluster_HTL.txt		  cluster_LTH.txt	    heatmap.png
cluster_HTL2.txt	  cluster_LTH2.txt	    manual_clustnums
cluster_HTL2_heatmap.png  cluster_LTH2_heatmap.png
cluster_HTL_heatmap.png   cluster_LTH3.txt
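
The listing above shows the issue: Crab A has two HTL modules and three LTH modules. As a minimal sketch, assuming file names follow the `cluster_<LABEL><optional number>.txt` convention seen here, the modules can be grouped by label in Python. The helper name `group_by_pattern` and the regular expression are illustrative and are not part of the original scripts.

```python
import re
from collections import defaultdict

# Matches names such as cluster_HTL.txt or cluster_LTH3.txt. The optional
# trailing digits distinguish modules that share one pattern label.
MODULE_NAME = re.compile(r"^cluster_([A-Z]+)\d*\.txt$")

def group_by_pattern(filenames):
    """Map each pattern label (HTL, LTH, ...) to the module files carrying it."""
    groups = defaultdict(list)
    for name in filenames:
        match = MODULE_NAME.match(name)
        if match:  # heatmaps and other non-module files are skipped
            groups[match.group(1)].append(name)
    return {label: sorted(names) for label, names in groups.items()}

# A subset of the Crab A listing, including two files that should be ignored
crab_a_files = [
    "cluster_HLH.txt", "cluster_HTL.txt", "cluster_HTL2.txt", "cluster_LHL.txt",
    "cluster_LTH.txt", "cluster_LTH2.txt", "cluster_LTH3.txt",
    "cluster_HLH_heatmap.png", "heatmap.png",
]
groups = group_by_pattern(crab_a_files)
print(groups["LTH"])
```

Each key of `groups` corresponds to one merged file that the cells below produce for Crab A.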


And let's also see what each cluster looks like

In [4]:
!head ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_HLH.txt

"id178_TPM"	"id359_TPM"	"id463_TPM"
"TRINITY_DN0_c2_g1_i16"	17.5085	14.8448	16.3822
"TRINITY_DN2974_c0_g2_i1"	30.3466	14.5954	23.4665
"TRINITY_DN2994_c0_g1_i14"	3.86961	0.28522	2.96975
"TRINITY_DN468_c0_g1_i10"	2.44991	1.27697	2.14369
"TRINITY_DN1192_c2_g1_i4"	2.98742	0.60322	2.73096
"TRINITY_DN16297_c1_g1_i1"	2.88039	0.578571	1.87961
"TRINITY_DN16312_c0_g1_i1"	3.28456	1.27841	2.32001
"TRINITY_DN15552_c0_g1_i1"	3.114	1.39303	2.36459
"TRINITY_DN8275_c0_g1_i1"	2.94605	0.542647	2.84651


Looks like we need to remove the first line of each file. Otherwise, every merged file would contain one header line per source module. The header only names the libraries sampled on Days 0, 2, and 17, so we lose little by dropping it.

Now, let's see how many crab folders we have

In [5]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/

Crab_A	Crab_B	Crab_C	Crab_D	Crab_E	Crab_F	Crab_G	Crab_H	Crab_I


Looks good! We can move on.

## Crab A

We'll now start on merging all modules for Crab A

Let's take another look at the current modules for Crab A

In [6]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/

cluster_HLH.txt		  cluster_LHL.txt	    cluster_LTH3_heatmap.png
cluster_HLH_heatmap.png   cluster_LHL_heatmap.png   cluster_LTH_heatmap.png
cluster_HTL.txt		  cluster_LTH.txt	    heatmap.png
cluster_HTL2.txt	  cluster_LTH2.txt	    manual_clustnums
cluster_HTL2_heatmap.png  cluster_LTH2_heatmap.png
cluster_HTL_heatmap.png   cluster_LTH3.txt


In [7]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules

# Merge all HTL modules (the -name pattern is quoted so find, not the shell, expands it)
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A -maxdepth 1 -name 'cluster_HLH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LHL_merged.txt

# Won't merge MIX modules, as none are present in this crab
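
The cell above can also be expressed as a small Python function. This is a sketch, not the code that produced the outputs in this notebook: `merge_modules` is a hypothetical helper, and the toy files it runs on are made up for the demonstration. It drops the first line of each matching file, as `tail -n +2` does, and writes the remaining rows to `merged_modules/<LABEL>_merged.txt`.

```python
import re
import tempfile
from pathlib import Path

def merge_modules(crab_dir, label):
    """Merge cluster_<label>.txt, cluster_<label>2.txt, ... into one headerless file.

    Hypothetical helper. Returns the merged file's path and the number of
    source files that were merged.
    """
    crab_dir = Path(crab_dir)
    out_dir = crab_dir / "merged_modules"
    out_dir.mkdir(exist_ok=True)
    # Only an optional number may follow the label in the file name
    wanted = re.compile(rf"^cluster_{re.escape(label)}\d*\.txt$")
    sources = sorted(p for p in crab_dir.iterdir() if wanted.match(p.name))
    merged = out_dir / f"{label}_merged.txt"
    with merged.open("w") as out:
        for src in sources:
            with src.open() as handle:
                next(handle, None)      # skip the header line
                out.writelines(handle)  # copy the remaining gene rows
    return merged, len(sources)

# Toy demonstration with made-up rows (not real data from this project)
header = '"id178_TPM"\t"id359_TPM"\t"id463_TPM"\n'
with tempfile.TemporaryDirectory() as tmp:
    crab = Path(tmp)
    (crab / "cluster_HTL.txt").write_text(
        header + '"gene_1"\t9.0\t4.0\t1.0\n"gene_2"\t8.0\t8.0\t2.0\n')
    (crab / "cluster_HTL2.txt").write_text(header + '"gene_3"\t5.0\t1.0\t1.0\n')
    (crab / "cluster_LTH.txt").write_text(header + '"gene_4"\t1.0\t2.0\t9.0\n')
    merged_path, n_sources = merge_modules(crab, "HTL")
    merged_lines = merged_path.read_text().splitlines()

print(n_sources, len(merged_lines))
```

One design difference from the shell version: the regular expression accepts only an optional number after the label, whereas a glob such as `cluster_HL*txt` would also match a file named `cluster_HLH.txt` if one were present.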

Check that the merge worked by comparing line counts. Crab A has 7 cluster files, so the merged total should be 7 lines lower than the original total (one removed header per file).

In [8]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_*txt

    49 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_HLH.txt
  3666 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_HTL.txt
   602 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_HTL2.txt
  4583 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_LHL.txt
  1335 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_LTH.txt
   672 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_LTH2.txt
  1289 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/cluster_LTH3.txt
 12196 total


In [9]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/*merged.txt

    48 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HLH_merged.txt
  4266 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HTL_merged.txt
  4582 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LHL_merged.txt
  3293 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LTH_merged.txt
 12189 total


Looks good! We can move on.
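
The same check can be done per module rather than by eye. The sketch below takes the Crab A counts reported by `wc -l` above and computes what each merged file should contain: the sum of its source files minus one header per source. The function name `expected_merged_counts` is illustrative only.

```python
import re

def expected_merged_counts(original_counts):
    """Turn {"cluster_HTL2.txt": 602, ...} into expected lines per merged label."""
    expected = {}
    for name, n_lines in original_counts.items():
        label = re.match(r"cluster_([A-Z]+)\d*\.txt$", name).group(1)
        # Each source file contributes all of its lines except the header
        expected[label] = expected.get(label, 0) + n_lines - 1
    return expected

# Line counts copied from the two wc -l outputs for Crab A
crab_a_original = {
    "cluster_HLH.txt": 49, "cluster_HTL.txt": 3666, "cluster_HTL2.txt": 602,
    "cluster_LHL.txt": 4583, "cluster_LTH.txt": 1335, "cluster_LTH2.txt": 672,
    "cluster_LTH3.txt": 1289,
}
crab_a_merged = {"HLH": 48, "HTL": 4266, "LHL": 4582, "LTH": 3293}

expected = expected_merged_counts(crab_a_original)
print(expected == crab_a_merged)
print(sum(crab_a_original.values()) - sum(crab_a_merged.values()))
```

For Crab A this prints `True` and `7`: every merged module has the expected size, and the totals differ by exactly the 7 removed headers (12196 vs. 12189).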

## Crab B

In [10]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/

cluster_HLH.txt		  cluster_LHL.txt	    cluster_LTH2_heatmap.png
cluster_HLH_heatmap.png   cluster_LHL2.txt	    cluster_LTH_heatmap.png
cluster_HTL.txt		  cluster_LHL2_heatmap.png  heatmap.png
cluster_HTL2.txt	  cluster_LHL_heatmap.png   manual_clustnums
cluster_HTL2_heatmap.png  cluster_LTH.txt
cluster_HTL_heatmap.png   cluster_LTH2.txt


In [11]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B -maxdepth 1 -name 'cluster_HLH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LHL_merged.txt

# Won't merge MIX modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab B has 7 cluster files, so the merged total should be 7 lines lower than the original total (one removed header per file).

In [12]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_*txt

   925 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_HLH.txt
  1506 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_HTL.txt
   348 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_HTL2.txt
  8280 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_LHL.txt
  1250 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_LHL2.txt
  3092 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_LTH.txt
  2627 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/cluster_LTH2.txt
 18028 total


In [13]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/*merged.txt

   924 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HLH_merged.txt
  1852 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HTL_merged.txt
  9528 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LHL_merged.txt
  5717 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LTH_merged.txt
 18021 total


Looks good! We can move on.

## Crab C

In [14]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/

cluster_HLH.txt		  cluster_HTL3_heatmap.png  cluster_LTH2_heatmap.png
cluster_HLH_heatmap.png   cluster_HTL_heatmap.png   cluster_LTH_heatmap.png
cluster_HTL.txt		  cluster_LHL.txt	    heatmap.png
cluster_HTL2.txt	  cluster_LHL_heatmap.png   manual_clustnums
cluster_HTL2_heatmap.png  cluster_LTH.txt
cluster_HTL3.txt	  cluster_LTH2.txt


In [15]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C -maxdepth 1 -name 'cluster_HLH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/LHL_merged.txt

# Won't merge MIX modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab C has 7 cluster files, so the merged total should be 7 lines lower than the original total (one removed header per file).

In [16]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_*txt

  1998 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_HLH.txt
  1730 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_HTL.txt
   898 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_HTL2.txt
  1395 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_HTL3.txt
   320 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_LHL.txt
  3840 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_LTH.txt
   363 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/cluster_LTH2.txt
 10544 total


In [17]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/*merged.txt

  1997 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HLH_merged.txt
  4020 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HTL_merged.txt
   319 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/LHL_merged.txt
  4201 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/LTH_merged.txt
 10537 total


Looks good! We can move on.

## Crab D

In [18]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/

cluster_HLH.txt		 cluster_LHL_heatmap.png   cluster_MIX.txt
cluster_HLH_heatmap.png  cluster_LTH.txt	   cluster_MIX_heatmap.png
cluster_HTL.txt		 cluster_LTH2.txt	   heatmap.png
cluster_HTL_heatmap.png  cluster_LTH2_heatmap.png  manual_clustnums
cluster_LHL.txt		 cluster_LTH_heatmap.png


In [19]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D -maxdepth 1 -name 'cluster_HLH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/LHL_merged.txt

# Merge MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D -maxdepth 1 -name 'cluster_MIX*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/MIX_merged.txt


Check that the merge worked by comparing line counts. Crab D has 6 cluster files, so the merged total should be 6 lines lower than the original total (one removed header per file).

In [20]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_*txt

  1082 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_HLH.txt
   428 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_HTL.txt
  4166 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_LHL.txt
  7279 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_LTH.txt
   604 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_LTH2.txt
   697 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/cluster_MIX.txt
 14256 total


In [21]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/*merged.txt

  1081 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/HLH_merged.txt
   427 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/HTL_merged.txt
  4165 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/LHL_merged.txt
  7881 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/LTH_merged.txt
   696 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_D/merged_modules/MIX_merged.txt
 14250 total


Looks good! We can move on.

## Crab E

In [22]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/

cluster_HTL.txt		  cluster_LHL_heatmap.png   cluster_LTH3_heatmap.png
cluster_HTL2.txt	  cluster_LTH.txt	    cluster_LTH_heatmap.png
cluster_HTL2_heatmap.png  cluster_LTH2.txt	    heatmap.png
cluster_HTL_heatmap.png   cluster_LTH2_heatmap.png  manual_clustnums
cluster_LHL.txt		  cluster_LTH3.txt


In [23]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/LTH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/LHL_merged.txt

# Won't merge HLH and MIX modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab E has 6 cluster files, so the merged total should be 6 lines lower than the original total (one removed header per file).

In [24]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_*txt

  3937 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_HTL.txt
  3392 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_HTL2.txt
  3567 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_LHL.txt
  3560 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_LTH.txt
   716 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_LTH2.txt
   580 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/cluster_LTH3.txt
 15752 total


In [25]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/*merged.txt

  7327 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/HTL_merged.txt
  3566 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/LHL_merged.txt
  4853 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_E/merged_modules/LTH_merged.txt
 15746 total


Looks good! We can move on.

## Crab F

In [26]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/

cluster_HTL.txt		  cluster_LHL_heatmap.png   cluster_MIX.txt
cluster_HTL2.txt	  cluster_LTH.txt	    cluster_MIX_heatmap.png
cluster_HTL2_heatmap.png  cluster_LTH2.txt	    heatmap.png
cluster_HTL_heatmap.png   cluster_LTH2_heatmap.png  manual_clustnums
cluster_LHL.txt		  cluster_LTH_heatmap.png


In [27]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F -maxdepth 1 -name 'cluster_HTL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F -maxdepth 1 -name 'cluster_LTH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules/LTH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F -maxdepth 1 -name 'cluster_LHL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules/LHL_merged.txt

# Merge MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F -maxdepth 1 -name 'cluster_MIX*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules/MIX_merged.txt

# Won't merge HLH modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab F has 6 cluster files, so the merged total should be 6 lines lower than the original total (one removed header per file).

In [28]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/cluster_*txt

In [29]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_F/merged_modules/*merged.txt

The recorded output for these two cells came from a run that pointed at Crab_D rather than Crab_F, so it repeated the Crab D counts shown earlier and is left out here.

## Crab G

In [30]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/

cluster_HL.txt		 cluster_LH2_heatmap.png   cluster_MIX3_heatmap.png
cluster_HL2.txt		 cluster_LH_heatmap.png    cluster_MIX4.txt
cluster_HL2_heatmap.png  cluster_MIX.txt	   cluster_MIX4_heatmap.png
cluster_HL_heatmap.png	 cluster_MIX2.txt	   cluster_MIX_heatmap.png
cluster_LH.txt		 cluster_MIX2_heatmap.png  heatmap.png
cluster_LH2.txt		 cluster_MIX3.txt	   manual_clustnums


In [31]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G -maxdepth 1 -name 'cluster_HL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/HL_merged.txt

# Merge all LH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G -maxdepth 1 -name 'cluster_LH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/LH_merged.txt

# Merge all MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G -maxdepth 1 -name 'cluster_MIX*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/MIX_merged.txt

# Won't merge HH or LL modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab G has 8 cluster files, so the merged total should be 8 lines lower than the original total (one removed header per file).

In [32]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_*txt

  1783 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_HL.txt
    54 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_HL2.txt
  2003 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_LH.txt
     9 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_LH2.txt
  9885 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_MIX.txt
   180 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_MIX2.txt
  1238 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_MIX3.txt
    37 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/cluster_MIX4.txt
 15189 total


In [33]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/*merged.txt

  1835 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/HL_merged.txt
  2010 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/LH_merged.txt
 11336 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_G/merged_modules/MIX_merged.txt
 15181 total


Looks good! We can move on.

## Crab H

In [34]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/

cluster_HL.txt		cluster_MIX.txt		  cluster_MIX3_heatmap.png
cluster_HL_heatmap.png	cluster_MIX2.txt	  cluster_MIX_heatmap.png
cluster_LH.txt		cluster_MIX2_heatmap.png  heatmap.png
cluster_LH_heatmap.png	cluster_MIX3.txt	  manual_clustnums


In [35]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H -maxdepth 1 -name 'cluster_HL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/HL_merged.txt

# Merge all LH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H -maxdepth 1 -name 'cluster_LH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/LH_merged.txt

# Merge all MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H -maxdepth 1 -name 'cluster_MIX*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/MIX_merged.txt

# Won't merge LL or HH modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab H has 5 cluster files, so the merged total should be 5 lines lower than the original total (one removed header per file).

In [36]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_*txt

  2242 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_HL.txt
   155 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_LH.txt
  7821 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_MIX.txt
   340 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_MIX2.txt
    49 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/cluster_MIX3.txt
 10607 total


In [37]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/*merged.txt

  2241 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/HL_merged.txt
   154 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/LH_merged.txt
  8207 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_H/merged_modules/MIX_merged.txt
 10602 total


Looks good! We can move on.

## Crab I

In [39]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/

cluster_HL.txt		cluster_LH2_heatmap.png  cluster_MIX.txt
cluster_HL_heatmap.png	cluster_LH_heatmap.png	 cluster_MIX_heatmap.png
cluster_LH.txt		cluster_LL.txt		 heatmap.png
cluster_LH2.txt		cluster_LL_heatmap.png	 manual_clustnums


In [40]:
# Make new directory for merged modules (-p: no error if it already exists)
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I -maxdepth 1 -name 'cluster_HL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/HL_merged.txt

# Merge all LH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I -maxdepth 1 -name 'cluster_LH*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/LH_merged.txt

# Merge all LL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I -maxdepth 1 -name 'cluster_LL*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/LL_merged.txt

# Merge all MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I -maxdepth 1 -name 'cluster_MIX*txt' | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/MIX_merged.txt

# Won't merge HH modules, as none are present in this crab

Check that the merge worked by comparing line counts. Crab I has 5 cluster files, so the merged total should be 5 lines lower than the original total (one removed header per file).

In [41]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_*txt

   305 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_HL.txt
  4346 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_LH.txt
   271 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_LH2.txt
  7692 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_LL.txt
    75 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/cluster_MIX.txt
 12689 total


In [42]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/*merged.txt

   304 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/HL_merged.txt
  4615 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/LH_merged.txt
  7691 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/LL_merged.txt
    74 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_I/merged_modules/MIX_merged.txt
 12684 total


Looks good! We can move on.

## Done merging

Now, let's get a count of the number of lines in each module in each crab

## Line Counts of Modules

In [43]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_*/merged_modules/*merged.txt

     48 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HLH_merged.txt
   4266 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HTL_merged.txt
   4582 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LHL_merged.txt
   3293 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LTH_merged.txt
    924 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HLH_merged.txt
   1852 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/HTL_merged.txt
   9528 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LHL_merged.txt
   5717 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_B/merged_modules/LTH_merged.txt
   1997 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HLH_merged.txt
   4020 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_C/merged_modules/HTL_merged.txt
    319 ../output/manual_clustering/cbai

We'll now write the above line counts to a file, which we'll then turn into a table using R.

In [1]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/Crab_*/merged_modules/*merged.txt > ../output/manual_clustering/cbai_transcriptomev4.0/merged_modules_raw_counts.txt
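
The table itself is built in R, as noted above. Purely to illustrate the structure of the raw counts file, here is a Python sketch that parses `wc -l` output into (crab, module, line count) rows. The function name `parse_wc_output` and the regular expression are assumptions for this example, and the sample text reuses the Crab A lines and total shown earlier.

```python
import re

# One wc -l line: leading spaces, a count, then a path ending in
# Crab_<letter>/merged_modules/<LABEL>_merged.txt
WC_LINE = re.compile(
    r"^\s*(\d+)\s+\S*/Crab_([A-Z])/merged_modules/([A-Z]+)_merged\.txt$"
)

def parse_wc_output(text):
    """Return (crab, module, n_lines) tuples, skipping 'total' and other lines."""
    rows = []
    for line in text.splitlines():
        match = WC_LINE.match(line)
        if match:
            count, crab, module = match.groups()
            rows.append((crab, module, int(count)))
    return rows

sample = """\
     48 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HLH_merged.txt
   4266 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/HTL_merged.txt
   4582 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LHL_merged.txt
   3293 ../output/manual_clustering/cbai_transcriptomev4.0/Crab_A/merged_modules/LTH_merged.txt
  12189 total
"""
rows = parse_wc_output(sample)
for crab, module, n_lines in rows:
    print(crab, module, n_lines)
```

Because the `total` line does not match the pattern, it is dropped, which leaves one row per crab and module.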