Merge pull request #9 from RenneLab/v3.2
V3.2
dstrib committed Aug 17, 2023
2 parents d7f792d + 0b4ae75 commit d4d33a2
Showing 78 changed files with 16,595 additions and 60,133 deletions.
8 changes: 8 additions & 0 deletions docs/source/about.rst
@@ -22,6 +22,13 @@ Lead Developer
 Changelog
 ---------

+
+* 0.3.2 (2023-08) Changes include:
+
+* Misc Bugfixes and Refinements
+* Add duplicate hybrid filtration (by HybRecord.id) options to hyb_filter
+* Add duplicate hybrid filtration to example analyses
+
 * 0.3.1 (2023-08) Changes include:

 * Misc Bugfixes and Refinements
@@ -31,6 +38,7 @@ Changelog
* Change default plot colors to the Bang Wong scheme [Wong2011]_ for
colorblind accessibility
* Documentation corrections
* Spellcheck

* 0.3.0 (2023-04) Major Codebase And API Overhaul. Changes include:

2 changes: 1 addition & 1 deletion example_01_type_mirna_analysis/README.rst
@@ -23,7 +23,7 @@ The data files can be downloaded and uncompressed by using the command::
 The unpacked hyb data-files require ~2 GB of space.
 The completed output of the analysis requires ~1.5 GB of space.

-Summary Analysis Example Output
+Type-miRNA Analysis Example Output
 -------------------------------

 .. image:: ../example_01_type_mirna_analysis/example_output/combined_analysis_types_hybrid_types.png
18 changes: 15 additions & 3 deletions example_01_type_mirna_analysis/analysis_python.py
@@ -40,6 +40,11 @@
 # Set hybrid segment types to remove as part of quality control (QC)
 remove_types = ['rRNA', 'mitoch-rRNA']

+# Initialize Combined Analysis
+combined_analysis = hybkit.analysis.Analysis(
+    analysis_types=['type', 'mirna'], name='Combined Analysis'
+)
+
 # Iterate over each input file, find the segment types, and save the output
 # in the output directory.
 for in_file_path in input_files:
@@ -56,13 +61,14 @@
     file_analysis = hybkit.analysis.Analysis(
         analysis_types=['type', 'mirna'], name=in_file_label
     )
-    combined_analysis = hybkit.analysis.Analysis(
-        analysis_types=['type', 'mirna'], name='Combined Analysis'
-    )

     # Open one HybFile entry for reading, and one for writing
     with hybkit.HybFile(in_file_path, 'r', hybformat_id=True) as in_file, \
          hybkit.HybFile(out_file_path, 'w') as out_file:
+
+        # Track last record identifier, to use only one record per read
+        last_record_id = None
+
         # Iterate over each record of the input file
         for hyb_record in in_file:
             hyb_record.eval_types()  # Find segment types
@@ -74,10 +80,16 @@
                 if hyb_record.has_prop('any_seg_type_is', remove_type):
                     use_record = False
                     break
+
             # If record has an excluded type, continue to next record without analyzing.
             if not use_record:
                 continue

+            # If record is a duplicate, skip it.
+            elif hyb_record.id == last_record_id:
+                continue
+            last_record_id = hyb_record.id
+
             # Set the dataset label for the record
             hyb_record.set_flag('dataset', in_file_label)

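The two changes to analysis_python.py can be illustrated without hybkit. The sketch below is a minimal stand-in, not the library's implementation: `Record`, `filter_records`, and the in-memory `datasets` are hypothetical names invented for illustration. It shows (1) a combined accumulator created once before the file loop so that it spans every input file, and (2) skipping a record whose id matches the previously kept record, which assumes duplicate ids appear on adjacent lines.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for a hyb record: a read id plus two segment types."""
    id: str
    seg_types: tuple


def filter_records(records: Iterable[Record], remove_types: set) -> Iterator[Record]:
    """Drop records with an excluded segment type, then skip adjacent duplicate ids."""
    last_record_id = None  # reset for every file, as in the diff
    for record in records:
        # Exclusion check comes first
        if any(seg_type in remove_types for seg_type in record.seg_types):
            continue
        # Only a record that passed the exclusion check can shadow a later one
        if record.id == last_record_id:
            continue
        last_record_id = record.id
        yield record


# Hypothetical in-memory "files" standing in for the hyb input files
datasets = {
    "file_a": [
        Record("read_1", ("miRNA", "mRNA")),
        Record("read_1", ("miRNA", "pseudogene")),  # same read id: skipped
        Record("read_2", ("rRNA", "mRNA")),         # excluded type: skipped
        Record("read_3", ("mRNA", "mRNA")),
    ],
    "file_b": [
        Record("read_4", ("miRNA", "mRNA")),
        Record("read_4", ("miRNA", "mRNA")),        # same read id: skipped
    ],
}
remove_types = {"rRNA", "mitoch-rRNA"}

combined_counts = Counter()  # created once, before the loop, so it spans all files
for label, records in datasets.items():
    file_counts = Counter()  # created per file
    for record in filter_records(records, remove_types):
        hybrid_type = "--".join(record.seg_types)
        file_counts[hybrid_type] += 1
        combined_counts[hybrid_type] += 1

print(dict(combined_counts))  # {'miRNA--mRNA': 2, 'mRNA--mRNA': 1}
```

If the combined accumulator were instead created inside the loop, as in the removed lines, it would be reset for each file and end up holding only the last file's counts.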
3 changes: 2 additions & 1 deletion example_01_type_mirna_analysis/analysis_shell.sh
@@ -9,7 +9,7 @@ NOTES="""
 Analysis for type/mirna analysis performed using shell scripts.
 Provided as an example of usage of hybkit shell executable scripts.
-This will produce identical output to analysis_python.py version,
+This will produce identical output to the analysis_python.py version,
 though that implementation is more efficient.
 See: 'README.rst' for this analysis for more information.
@@ -70,6 +70,7 @@ hyb_filter -i ${EVAL_FILES[*]} --verbose \
     -o ${QC_FILES} \
     --exclude any_seg_type_is rRNA \
     --exclude_2 any_seg_type_is mitoch-rRNA \
+    --skip_dup_id_after \

 # Cleanup intermediate files
 rm -v ${OUT_DIR}/*evaluated*.hyb
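The name `--skip_dup_id_after` suggests that duplicate ids are skipped after the exclusion criteria have been applied, which would match the order used in analysis_python.py; the diff itself does not state this, so treat it as an assumption. The sketch below uses hypothetical helper names (`drop_excluded`, `drop_adjacent_dups`) and toy records, not hyb_filter's code, to show why the order can matter.

```python
def drop_excluded(records, remove_types):
    """Remove records where any segment type is in remove_types."""
    return [rec for rec in records if not (set(rec[1]) & remove_types)]


def drop_adjacent_dups(records):
    """Keep only the first of each run of records sharing the same id."""
    kept, last_id = [], None
    for rec in records:
        if rec[0] != last_id:
            kept.append(rec)
        last_id = rec[0]
    return kept


# Toy (id, segment types) records; read_1 appears twice
records = [
    ("read_1", ("rRNA", "mRNA")),
    ("read_1", ("miRNA", "mRNA")),
    ("read_2", ("miRNA", "mRNA")),
]
remove_types = {"rRNA", "mitoch-rRNA"}

# Assumed "after" order: exclude first, then skip duplicate ids
dedup_after = drop_adjacent_dups(drop_excluded(records, remove_types))
# Opposite order, for contrast: skip duplicate ids first, then exclude
dedup_before = drop_excluded(drop_adjacent_dups(records), remove_types)

print([rec[0] for rec in dedup_after])   # ['read_1', 'read_2']
print([rec[0] for rec in dedup_before])  # ['read_2']
```

With deduplication last, a read whose first record is excluded can still contribute its next record; with deduplication first, that read is lost entirely.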
@@ -1,6 +1,6 @@
-mirna_analysis_count,445194
-has_mirna,272840
-non_mirna,172354
-mirnas_5p,238924
-mirnas_3p,11441
-mirna_dimers,22475
+mirna_analysis_count,209937
+has_mirna,155572
+non_mirna,54365
+mirnas_5p,128134
+mirnas_3p,6006
+mirna_dimers,21432
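The six counts in this hunk can be checked for internal consistency. The sketch below copies the values from the hunk and tests two sum relationships that hold in these particular numbers; they are observed here, not taken from documentation, and `parse_counts` and `is_consistent` are hypothetical helper names.

```python
import csv
import io

# Counts copied from the hunk above: the first six rows and the last six rows
BEFORE = """\
mirna_analysis_count,445194
has_mirna,272840
non_mirna,172354
mirnas_5p,238924
mirnas_3p,11441
mirna_dimers,22475
"""
AFTER = """\
mirna_analysis_count,209937
has_mirna,155572
non_mirna,54365
mirnas_5p,128134
mirnas_3p,6006
mirna_dimers,21432
"""


def parse_counts(text):
    """Parse 'key,count' rows into a dict of ints."""
    return {row[0]: int(row[1]) for row in csv.reader(io.StringIO(text)) if row}


def is_consistent(counts):
    """Check the two sum relationships observed in these numbers."""
    total_ok = counts["has_mirna"] + counts["non_mirna"] == counts["mirna_analysis_count"]
    split_ok = (
        counts["mirnas_5p"] + counts["mirnas_3p"] + counts["mirna_dimers"]
        == counts["has_mirna"]
    )
    return total_ok and split_ok


before, after = parse_counts(BEFORE), parse_counts(AFTER)
print(is_consistent(before), is_consistent(after))  # True True

retained = after["mirna_analysis_count"] / before["mirna_analysis_count"]
print(f"{retained:.1%}")  # 47.2%
```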
@@ -1,12 +1,12 @@
-mRNA,477847
-miRNA,279775
-pseudogene,92391
-KSHV-miRNA,15540
-pr-tr,10915
-tRNA,7734
-miscRNA,2482
-snRNA,1413
-lincRNA,1372
-snoRNA,915
-Ig,3
+mRNA,194862
+miRNA,167593
+pseudogene,41245
+KSHV-miRNA,9411
+pr-tr,2870
+tRNA,1684
+miscRNA,683
+lincRNA,612
+snoRNA,497
+snRNA,414
+Ig,2
 Trec,1
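To see how unevenly the segment-type counts changed, the fraction retained per type can be computed from the two halves of this hunk. This is a small sketch using the values above; `Trec,1` is treated as unchanged, which is what the 12-line hunk header implies, and the variable names are illustrative.

```python
# Segment-type counts copied from the hunk above
before = {
    "mRNA": 477847, "miRNA": 279775, "pseudogene": 92391, "KSHV-miRNA": 15540,
    "pr-tr": 10915, "tRNA": 7734, "miscRNA": 2482, "snRNA": 1413,
    "lincRNA": 1372, "snoRNA": 915, "Ig": 3, "Trec": 1,
}
after = {
    "mRNA": 194862, "miRNA": 167593, "pseudogene": 41245, "KSHV-miRNA": 9411,
    "pr-tr": 2870, "tRNA": 1684, "miscRNA": 683, "lincRNA": 612,
    "snoRNA": 497, "snRNA": 414, "Ig": 2, "Trec": 1,
}

# Fraction of each type's count that remains in the new output
retention = {seg_type: after[seg_type] / before[seg_type] for seg_type in before}

# Print types from least to most retained
for seg_type, fraction in sorted(retention.items(), key=lambda item: item[1]):
    print(f"{seg_type:12s}{fraction:7.1%}")
```

The very small categories (`Ig`, `Trec`) are too sparse for their ratios to be meaningful.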
@@ -1,91 +1,89 @@
miRNA--mRNA,187311
mRNA--mRNA,112621
miRNA--pseudogene,31252
mRNA--pseudogene,20701
miRNA--miRNA,20260
pseudogene--mRNA,15354
KSHV-miRNA--mRNA,10060
pseudogene--pseudogene,9186
mRNA--miRNA,8716
miRNA--pr-tr,4579
tRNA--mRNA,3511
KSHV-miRNA--pseudogene,1920
pr-tr--mRNA,1909
tRNA--pseudogene,1616
mRNA--pr-tr,1540
miRNA--tRNA,1178
pseudogene--miRNA,1168
KSHV-miRNA--miRNA,1090
miRNA--KSHV-miRNA,1047
miRNA--miscRNA,829
miscRNA--mRNA,699
pr-tr--pr-tr,681
mRNA--KSHV-miRNA,587
mRNA--tRNA,575
miRNA--snRNA,562
pr-tr--pseudogene,560
miRNA--lincRNA,495
pseudogene--pr-tr,313
lincRNA--mRNA,312
tRNA--miRNA,310
snoRNA--mRNA,291
miscRNA--pseudogene,287
snRNA--mRNA,263
KSHV-miRNA--pr-tr,254
mRNA--lincRNA,252
pr-tr--miRNA,250
mRNA--miscRNA,228
mRNA--snRNA,213
miRNA--snoRNA,202
pseudogene--tRNA,194
snoRNA--pseudogene,132
miscRNA--miRNA,108
KSHV-miRNA--tRNA,107
pseudogene--miscRNA,98
KSHV-miRNA--snRNA,96
snRNA--pseudogene,96
pseudogene--snRNA,88
pseudogene--KSHV-miRNA,85
lincRNA--pseudogene,82
mRNA--snoRNA,82
miRNA--mRNA,102489
mRNA--mRNA,33159
miRNA--miRNA,19328
miRNA--pseudogene,16077
mRNA--pseudogene,7179
pseudogene--mRNA,6233
KSHV-miRNA--mRNA,5552
mRNA--miRNA,4580
pseudogene--pseudogene,4490
miRNA--pr-tr,1518
KSHV-miRNA--pseudogene,1109
KSHV-miRNA--miRNA,1027
miRNA--KSHV-miRNA,999
pseudogene--miRNA,736
tRNA--mRNA,679
miRNA--tRNA,364
pr-tr--mRNA,362
mRNA--pr-tr,321
mRNA--KSHV-miRNA,305
tRNA--pseudogene,284
miRNA--lincRNA,277
miRNA--snRNA,226
miscRNA--mRNA,216
miRNA--miscRNA,202
miRNA--snoRNA,149
snoRNA--mRNA,138
pr-tr--pseudogene,135
pr-tr--pr-tr,130
mRNA--tRNA,130
miscRNA--pseudogene,107
lincRNA--mRNA,107
tRNA--miRNA,98
pseudogene--pr-tr,92
mRNA--lincRNA,87
KSHV-miRNA--KSHV-miRNA,78
snoRNA--miRNA,68
tRNA--tRNA,65
miscRNA--miscRNA,62
lincRNA--miRNA,58
pseudogene--lincRNA,46
snoRNA--snoRNA,45
KSHV-miRNA--miscRNA,42
tRNA--pr-tr,35
tRNA--KSHV-miRNA,32
snRNA--miRNA,30
KSHV-miRNA--lincRNA,28
lincRNA--lincRNA,26
pseudogene--snoRNA,26
pr-tr--KSHV-miRNA,22
tRNA--miscRNA,22
lincRNA--pr-tr,19
miscRNA--pr-tr,16
pr-tr--miscRNA,15
snRNA--snRNA,14
pr-tr--tRNA,12
pr-tr--lincRNA,11
snoRNA--snRNA,11
pr-tr--snRNA,10
snRNA--miscRNA,8
KSHV-miRNA--pr-tr,78
pr-tr--miRNA,68
pseudogene--KSHV-miRNA,67
snoRNA--pseudogene,61
pseudogene--tRNA,58
snRNA--mRNA,52
miscRNA--miRNA,49
mRNA--snRNA,43
snoRNA--miRNA,42
mRNA--miscRNA,42
KSHV-miRNA--tRNA,36
snoRNA--snoRNA,28
mRNA--snoRNA,28
lincRNA--pseudogene,27
pseudogene--miscRNA,27
KSHV-miRNA--snRNA,24
pseudogene--snRNA,22
lincRNA--lincRNA,21
snRNA--pseudogene,21
lincRNA--miRNA,19
pseudogene--lincRNA,18
KSHV-miRNA--lincRNA,18
snRNA--miRNA,16
miscRNA--miscRNA,11
pseudogene--snoRNA,11
tRNA--KSHV-miRNA,11
pr-tr--KSHV-miRNA,10
tRNA--pr-tr,8
KSHV-miRNA--snoRNA,7
lincRNA--KSHV-miRNA,4
snoRNA--pr-tr,4
lincRNA--tRNA,4
tRNA--lincRNA,4
snRNA--pr-tr,4
lincRNA--miscRNA,3
snRNA--tRNA,2
miRNA--Ig,2
KSHV-miRNA--miscRNA,7
lincRNA--pr-tr,6
tRNA--tRNA,4
pr-tr--lincRNA,4
miscRNA--pr-tr,3
lincRNA--KSHV-miRNA,2
tRNA--miscRNA,2
snRNA--snRNA,2
miscRNA--KSHV-miRNA,2
tRNA--snRNA,2
pr-tr--tRNA,2
tRNA--lincRNA,2
snoRNA--snRNA,2
pr-tr--miscRNA,1
snRNA--tRNA,1
miscRNA--lincRNA,1
pseudogene--Ig,1
tRNA--snRNA,1
mRNA--Trec,1
pr-tr--snRNA,1
snoRNA--KSHV-miRNA,1
snoRNA--lincRNA,1
snoRNA--pr-tr,1
miRNA--Ig,1
lincRNA--miscRNA,1
snRNA--miscRNA,1
