Aspen Coyle

afcoyle@uw.edu

2021-08-18

Roberts Lab, UW-SAFS

In script 8_4_manual_clustering_cbaiv4.0_immune_genes.Rmd, we took a subset of immune genes that were aligned to cbai_transcriptomev4.0, pulled their counts, grouped the libraries by crab (e.g. all libraries for Crab A, all for Crab B, and so on), and clustered the genes into modules based on their expression patterns.

Note that this script includes two more crabs, Crab D and Crab F, than scripts 7_4 and 7_5. Crabs D and F were uninfected, so it only made sense to align them to a _C. bairdi_-only library.

We then described each module's expression as following one of five patterns. For crabs with three time points (the ambient- and lowered-temperature treatment crabs), we used the following notation:

- High to low (HTL): Expression decreases over time (regardless of whether the decrease took place on Day 2 or Day 17)

- Low to high (LTH): Expression increases over time (regardless of whether the increase took place on Day 2 or Day 17)

- Low High Low (LHL): Expression increases on Day 2, and then drops on Day 17

- High Low High (HLH): Expression drops on Day 2 and then increases on Day 17

- Mixed (MIX): Expression within the module follows no clear pattern

Crabs in the elevated-temperature treatment group (Crabs G, H, and I) had only two time points. For these, a different notation was used:

- LL = expression stays low

- HH = expression stays high

- LH = expression goes from low to high

- HL = expression goes from high to low

- MIX = mixed - no clear pattern of expression within the module

Importantly, **multiple modules within a single crab could be given the same assignment**. This script resolves that by merging the gene lists of all modules that share an assignment within a crab.

First, let's look at the modules for one crab:
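Since one crab can hold several modules with the same label (for example `cluster_LTH.txt` and `cluster_LTH2.txt`), the merge amounts to grouping module files by label. Below is a minimal Python sketch of that grouping. It assumes the `cluster_<LABEL><optional number>.txt` naming visible in the directory listings in this notebook; the function name `group_modules_by_label` is hypothetical and is not part of the original workflow.

```python
import re
from collections import defaultdict

# Assumed naming scheme: cluster_<LABEL><optional digits>.txt, e.g. cluster_LTH2.txt
MODULE_RE = re.compile(r"^cluster_([A-Z]+)\d*\.txt$")

def group_modules_by_label(filenames):
    """Map each pattern label (HTL, LTH, HL, ...) to its module files."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = MODULE_RE.match(name)
        if match:  # heatmap PNGs and other files do not match and are skipped
            groups[match.group(1)].append(name)
    return dict(groups)

# File names copied from the Crab C directory listing in this notebook
crab_c = ["cluster_HLH.txt", "cluster_HTL.txt", "cluster_LHL.txt",
          "cluster_LTH.txt", "cluster_LTH2.txt", "cluster_LTH_heatmap.png",
          "heatmap.png"]
print(group_modules_by_label(crab_c))
```

Matching the full label, rather than a prefix glob such as `cluster_HL*txt`, would also keep `HL` files separate from `HLH` files if both ever appeared in the same folder.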

In [2]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/

cluster_HTL.txt		 cluster_LHL.txt	  heatmap.png
cluster_HTL_heatmap.png  cluster_LHL_heatmap.png


Let's also see what a single cluster file looks like:

In [3]:
!head ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/cluster_HTL.txt

"178"	"359"	"463"
"TRINITY_DN710_c0_g1_i1"	7.55246	0.100299	0
"TRINITY_DN25470_c0_g1_i1"	59.6977	0	0.119411
"TRINITY_DN859_c0_g1_i2"	1.92367	1.19377	0
"TRINITY_DN32773_c0_g1_i1"	128.025	101.267	0
"TRINITY_DN3560_c0_g1_i2"	0.860629	0.078715	0
"TRINITY_DN38617_c1_g1_i1"	66.0655	61.2506	0.0632432
"TRINITY_DN3050_c0_g1_i1"	0.537416	0	0


Looks like we need to remove the first line of each file. Otherwise, when we merge modules, each module's header line will end up in the merged file. Since the columns simply correspond to the Day 0, 2, and 17 samples, the header isn't too meaningful anyway.
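The header-dropping merge described here can be sketched in Python as follows. This is an illustration only, not the approach used in this notebook (which relies on `find`, `xargs`, and `tail`); the function name `merge_without_headers` is hypothetical, and the demonstration files are made-up toy data rather than real counts.

```python
import tempfile
from pathlib import Path

def merge_without_headers(module_paths, merged_path):
    """Concatenate module files, dropping the first (header) line of each."""
    kept = []
    for path in module_paths:
        lines = Path(path).read_text().splitlines(keepends=True)
        kept.extend(lines[1:])  # same effect as `tail -n +2` on one file
    Path(merged_path).write_text("".join(kept))
    return len(kept)

# Toy demonstration with made-up files (not data from this analysis)
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp, "cluster_LTH.txt")
    second = Path(tmp, "cluster_LTH2.txt")
    first.write_text('"d0"\t"d2"\t"d17"\n"gene_1"\t0\t1\t2\n')
    second.write_text('"d0"\t"d2"\t"d17"\n"gene_2"\t0\t3\t4\n"gene_3"\t1\t1\t5\n')
    merged = Path(tmp, "LTH_merged.txt")
    n_rows = merge_without_headers([first, second], merged)
    merged_text = merged.read_text()

print(n_rows)
print(merged_text)
```

Two source files with one header each go in, and only the gene rows come out, so the merged file has one fewer line per source file.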

Now, let's see how many crab folders we have

In [4]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/

Crab_A	Crab_B	Crab_C	Crab_D	Crab_E	Crab_F	Crab_G	Crab_H	Crab_I


Looks good! We can move on.

## Crab A

We'll now start on merging all modules for Crab A

Let's take another look at the current modules for Crab A

In [5]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/

cluster_HTL.txt		 cluster_LHL.txt	  heatmap.png
cluster_HTL_heatmap.png  cluster_LHL_heatmap.png


In [6]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/HTL_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A -maxdepth 1 -name "cluster_LHL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/LHL_merged.txt

# Won't merge LTH, HLH, or MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers.

In [7]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/cluster_*txt

  8 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/cluster_HTL.txt
 11 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/cluster_LHL.txt
 19 total


In [8]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/*merged.txt

  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/HTL_merged.txt
 10 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/LHL_merged.txt
 17 total


Looks good! We can move on.
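The check applied here is simple arithmetic: a merged file should contain the sum of its source files' line counts minus one header per source file. A small Python sketch of that rule, using line counts reported by `wc -l` in this notebook (the helper name `expected_merged_lines` is hypothetical):

```python
def expected_merged_lines(source_line_counts):
    """Each source module loses exactly one header line when merged."""
    return sum(source_line_counts) - len(source_line_counts)

# (source line counts, merged line count) as reported by `wc -l` in this notebook
reported = {
    "Crab A, HTL": ([8], 7),
    "Crab A, LHL": ([11], 10),
    "Crab C, LTH + LTH2": ([3, 8], 9),
    "Crab I, HL + HL2 + HL3": ([4, 7, 6], 14),
}

for name, (sources, merged) in reported.items():
    ok = expected_merged_lines(sources) == merged
    print(f"{name}: expected {expected_merged_lines(sources)}, found {merged}, match={ok}")
```

The same comparison applies to every crab, including those where several modules share a label.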

## Crab B

In [9]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/

cluster_HLH.txt		 cluster_HTL_heatmap.png  cluster_LTH.txt
cluster_HLH_heatmap.png  cluster_LHL.txt	  cluster_LTH_heatmap.png
cluster_HTL.txt		 cluster_LHL_heatmap.png  heatmap.png


In [10]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B -maxdepth 1 -name "cluster_LTH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B -maxdepth 1 -name "cluster_HLH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B -maxdepth 1 -name "cluster_LHL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LHL_merged.txt

# Won't merge MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers.

In [11]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/cluster_*txt

  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/cluster_HLH.txt
  5 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/cluster_HTL.txt
  5 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/cluster_LHL.txt
  9 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/cluster_LTH.txt
 22 total


In [12]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/*merged.txt

  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HLH_merged.txt
  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HTL_merged.txt
  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LHL_merged.txt
  8 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LTH_merged.txt
 18 total


Looks good! We can move on.

## Crab C

In [13]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/

cluster_HLH.txt		 cluster_LHL.txt	  cluster_LTH2_heatmap.png
cluster_HLH_heatmap.png  cluster_LHL_heatmap.png  cluster_LTH_heatmap.png
cluster_HTL.txt		 cluster_LTH.txt	  heatmap.png
cluster_HTL_heatmap.png  cluster_LTH2.txt


In [14]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C -maxdepth 1 -name "cluster_LTH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/LTH_merged.txt

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C -maxdepth 1 -name "cluster_HLH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HLH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C -maxdepth 1 -name "cluster_LHL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/LHL_merged.txt

# Won't merge MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers. Here, cluster_LTH.txt and cluster_LTH2.txt are combined into LTH_merged.txt.

In [15]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_*txt

  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_HLH.txt
  5 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_HTL.txt
  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_LHL.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_LTH.txt
  8 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/cluster_LTH2.txt
 23 total


In [16]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/*merged.txt

  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HLH_merged.txt
  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HTL_merged.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/LHL_merged.txt
  9 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/LTH_merged.txt
 18 total


Looks good! We can move on.

## Crab D

In [17]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/

cluster_HTL.txt		 cluster_LHL_heatmap.png  heatmap.png
cluster_HTL_heatmap.png  cluster_LTH.txt
cluster_LHL.txt		 cluster_LTH_heatmap.png


In [18]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D -maxdepth 1 -name "cluster_LTH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/LTH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D -maxdepth 1 -name "cluster_LHL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/LHL_merged.txt

# Won't merge HLH or MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers.

In [19]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/cluster_*txt

  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/cluster_HTL.txt
  9 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/cluster_LHL.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/cluster_LTH.txt
 15 total


In [20]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/*merged.txt

  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/HTL_merged.txt
  8 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/LHL_merged.txt
  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_D/merged_modules/LTH_merged.txt
 12 total


Looks good! We can move on.

## Crab E

In [21]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/

cluster_HTL.txt		 cluster_LHL_heatmap.png  heatmap.png
cluster_HTL_heatmap.png  cluster_LTH.txt
cluster_LHL.txt		 cluster_LTH_heatmap.png


In [22]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/HTL_merged.txt

# Merge all LTH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E -maxdepth 1 -name "cluster_LTH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/LTH_merged.txt

# Merge all LHL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E -maxdepth 1 -name "cluster_LHL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/LHL_merged.txt

# Won't merge HLH or MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers.

In [23]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/cluster_*txt

 13 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/cluster_HTL.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/cluster_LHL.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/cluster_LTH.txt
 19 total


In [24]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/*merged.txt

 12 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/HTL_merged.txt
  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/LHL_merged.txt
  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_E/merged_modules/LTH_merged.txt
 16 total


Looks good! We can move on.

## Crab F

In [25]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/

cluster_HLH.txt		 cluster_HTL.txt	  heatmap.png
cluster_HLH_heatmap.png  cluster_HTL_heatmap.png


In [26]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules

# Merge all HLH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F -maxdepth 1 -name "cluster_HLH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules/HLH_merged.txt

# Merge all HTL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F -maxdepth 1 -name "cluster_HTL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules/HTL_merged.txt

# Won't merge LTH, LHL, or MIX modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers.

In [27]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/cluster_*txt

  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/cluster_HLH.txt
 14 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/cluster_HTL.txt
 18 total


In [28]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules/*merged.txt

  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules/HLH_merged.txt
 13 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_F/merged_modules/HTL_merged.txt
 16 total


Looks good! We can move on.

## Crab G

In [29]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/

cluster_HL.txt		cluster_MIX.txt   cluster_MIX2_heatmap.png  heatmap.png
cluster_HL_heatmap.png	cluster_MIX2.txt  cluster_MIX_heatmap.png


In [30]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G -maxdepth 1 -name "cluster_HL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules/HL_merged.txt

# Merge all MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G -maxdepth 1 -name "cluster_MIX*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules/MIX_merged.txt

# Won't merge HH, LH, or LL modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers. Here, cluster_MIX.txt and cluster_MIX2.txt are combined into MIX_merged.txt.

In [31]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/cluster_*txt

  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/cluster_HL.txt
  6 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/cluster_MIX.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/cluster_MIX2.txt
 16 total


In [32]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules/*merged.txt

  6 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules/HL_merged.txt
  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_G/merged_modules/MIX_merged.txt
 13 total


Looks good! We can move on.

## Crab H

In [33]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/

cluster_HL.txt		cluster_LH2_heatmap.png  cluster_MIX.txt
cluster_HL_heatmap.png	cluster_LH3.txt		 cluster_MIX_heatmap.png
cluster_LH.txt		cluster_LH3_heatmap.png  heatmap.png
cluster_LH2.txt		cluster_LH_heatmap.png


In [34]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H -maxdepth 1 -name "cluster_HL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/HL_merged.txt

# Merge all LH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H -maxdepth 1 -name "cluster_LH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/LH_merged.txt

# Merge all MIX modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H -maxdepth 1 -name "cluster_MIX*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/MIX_merged.txt

# Won't merge LL or HH modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers. Here, cluster_LH.txt, cluster_LH2.txt, and cluster_LH3.txt are combined into LH_merged.txt.

In [35]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_*txt

  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_HL.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_LH.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_LH2.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_LH3.txt
  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/cluster_MIX.txt
 20 total


In [36]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/*merged.txt

  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/HL_merged.txt
  6 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/LH_merged.txt
  6 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_H/merged_modules/MIX_merged.txt
 15 total


Looks good! We can move on.

## Crab I

In [37]:
!ls ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/

cluster_HL.txt		 cluster_HL3.txt	  cluster_LH.txt
cluster_HL2.txt		 cluster_HL3_heatmap.png  cluster_LH_heatmap.png
cluster_HL2_heatmap.png  cluster_HL_heatmap.png   heatmap.png


In [38]:
# Make new directory for merged modules
!mkdir -p ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules

# Merge all HL modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I -maxdepth 1 -name "cluster_HL*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules/HL_merged.txt

# Merge all LH modules
!find ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I -maxdepth 1 -name "cluster_LH*txt" | xargs -n 1 tail -n +2 > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules/LH_merged.txt

# Won't merge HH, MIX, or LL modules, as none are present in this crab

Check that this worked by comparing line counts. Each merged file should have one fewer line per source module, since we removed the headers. Here, cluster_HL.txt, cluster_HL2.txt, and cluster_HL3.txt are combined into HL_merged.txt.

In [39]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/cluster_*txt

  4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/cluster_HL.txt
  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/cluster_HL2.txt
  6 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/cluster_HL3.txt
  3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/cluster_LH.txt
 20 total


In [40]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules/*merged.txt

 14 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules/HL_merged.txt
  2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_I/merged_modules/LH_merged.txt
 16 total


Looks good! We can move on.

## Done merging

Now, let's get a count of the number of lines in each module in each crab

## Line Counts of Modules

In [41]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_*/merged_modules/*merged.txt

   7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/HTL_merged.txt
  10 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/LHL_merged.txt
   2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HLH_merged.txt
   4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/HTL_merged.txt
   4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LHL_merged.txt
   8 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_B/merged_modules/LTH_merged.txt
   2 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HLH_merged.txt
   4 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/HTL_merged.txt
   3 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_C/merged_modules/LHL_merged.txt
   9 ../output/manual_clustering/cbai_transcri

We'll now write the above line counts to a file, which we'll then turn into a table using R

In [42]:
!wc -l ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_*/merged_modules/*merged.txt > ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/merged_modules_raw_counts.txt
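The raw `wc -l` output pairs each line count with a path that encodes both the crab and the pattern label. The table-building step is done in R and is not shown in this notebook; the Python sketch below only illustrates how such lines could be split into (crab, pattern, count) rows. The function name `parse_wc_output` and the regular expression are assumptions, and the sample text is the Crab A `wc -l` output shown earlier in this notebook.

```python
import re

# Assumed path layout: .../Crab_<letter>/merged_modules/<LABEL>_merged.txt
LINE_RE = re.compile(
    r"^\s*(\d+)\s+.*/(Crab_[A-Z])/merged_modules/([A-Z]+)_merged\.txt$"
)

def parse_wc_output(text):
    """Turn `wc -l` output lines into (crab, pattern, line_count) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the trailing "total" line does not match and is skipped
            rows.append((match.group(2), match.group(3), int(match.group(1))))
    return rows

# Crab A output of `wc -l`, copied from earlier in this notebook
sample = """\
  7 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/HTL_merged.txt
 10 ../output/manual_clustering/cbai_transcriptomev4.0/immune_genes/Crab_A/merged_modules/LHL_merged.txt
 17 total
"""

for row in parse_wc_output(sample):
    print(row)
```

Because headers were removed before merging, each count corresponds to the number of genes in that merged module.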