- `plot_signalp()` has been deprecated and removed.
- `get_signalp5()` now works with the https://services.healthtech.dtu.dk/service.php?SignalP-5.0 site, since the old link http://www.cbs.dtu.dk/services/SignalP/ no longer functions.
- `get_signalp()` now works with https://services.healthtech.dtu.dk/service.php?SignalP-4.1, since the old link http://www.cbs.dtu.dk/services/SignalP-4.1/ no longer functions.
- `get_signalp()` now runs one job at a time.
- `get_signalp()` splitter argument default value has been changed to 1000.
- `get_signalp()` sleep argument has been removed.
- `get_targetp()` now works with https://services.healthtech.dtu.dk/service.php?TargetP-1.1, since the old link http://www.cbs.dtu.dk/services/TargetP-1.1/ no longer functions.
- `get_targetp()` now runs one job at a time.
- `get_targetp()` splitter argument default value has been changed to 1000.
- `get_targetp()` sleep argument has been removed.
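
The splitter default mentioned above can be pictured as cutting the input into batches of at most 1000 sequences that are then submitted one job at a time. The following is a minimal base R sketch of that idea only; `chunk_sequences()` is a hypothetical helper that is not part of ragp, and how the package splits its input internally is an assumption.

```r
# Hypothetical helper (not part of ragp): split a character vector of
# sequences into consecutive batches of at most `splitter` elements.
chunk_sequences <- function(sequences, splitter = 1000) {
  # ceiling(i / splitter) is the batch number of the i-th sequence
  batch <- ceiling(seq_along(sequences) / splitter)
  unname(split(sequences, batch))
}

seqs <- rep("MKTAYIAKQR", 2500)   # toy input: 2500 identical sequences
batches <- chunk_sequences(seqs)  # three batches with the default splitter
batch_sizes <- lengths(batches)
```

A smaller splitter value gives more, shorter jobs; a larger one gives fewer, longer jobs.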
- Added new function `get_cdd()` which queries the Conserved Domain Database (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi).
- Added new function `get_signalp5()` which queries the SignalP5 web server (http://www.cbs.dtu.dk/services/SignalP).
- Added new function `get_tmhmm()` which queries the TMHMM v. 2.0 web server (http://www.cbs.dtu.dk/services/TMHMM/).
- `plot_prot()` `nsp` argument can now be `"signalp"`, `"signalp5"` or `"none"`. Default is `"signalp5"`. This argument determines whether `get_signalp()` or `get_signalp5()` is used for N-sp prediction. Data frame input is accepted as well.
- `plot_prot()` `tm` argument can now be `"phobius"`, `"tmhmm"` or `"none"`. Default is `"phobius"`. This argument determines whether `get_phobius()` or `get_tmhmm()` is used for TM prediction. Data frame input is accepted as well.
- `plot_prot()` `domain` argument can now be `"cdd"`, `"hmm"` or `"none"`. Default is `"cdd"`. This argument determines whether `get_hmm()` or `get_cdd()` is used for domain annotation. Data frame input is accepted as well.
- All `get_*` and `scan_*` functions, as well as `maab()`, now work with `AAStringSet` class objects. #5
- `get_signalp()` output contains an additional column `sp.length` - integer, length of the predicted signal peptide. This column is a copy of `Ymax.pos`. Potentially BREAKING CHANGE.
- `get_phobius()` output contains an additional column `sp.length` - integer, length of the predicted signal peptide. Potentially BREAKING CHANGE.
- `get_phobius()` output column "`Name`" has been renamed to "`id`". BREAKING CHANGE.
- `plot_prot()` default value for the `gpi` argument has been changed to "`netgpi`".
- `get_netGPI()` now runs one job at a time.
- `get_netGPI()` splitter argument default value has been changed to 2500.
- `get_targetp()` has been fixed to work with `org_type = "non_plant"`.
- `get_espritz()` has been restored since the Espritz server is again available at http://old.protein.bio.unipd.it/espritz/.
- `plot_prot()` has new arguments `gpi_size`, which controls the size of the GPI symbol, and `gpi_shape`, which controls the shape of the GPI symbol.
- `plot_prot()` has a new logical argument `hyp_scan`, which determines whether `scan_ag()` (when `ag = TRUE`) should scan for arabinogalactan motifs containing only predicted hydroxyprolines. This argument changes the previous default behavior of `plot_prot()`, which was equivalent to `hyp_scan = FALSE`.
- `get_phobius()` and `get_big_pi()` now use https web server addresses. #6
- `get_espritz()` has been removed since the Espritz server is no longer available for obtaining predictions. BREAKING CHANGE.
- xgboost models used by ragp were re-saved using xgboost 1.1.1.1 to increase compatibility.
- xgboost models used by ragp are now stored in the inst directory instead of internal sysdata.rda.
- `plot_prot()` arguments `nsp`, `domain`, `tm`, `gpi` and `disorder` can also be user supplied data frames obtained by calling `get_signalp()`, `get_hmm()`, `get_phobius()`, `get_big_pi()`, `get_pred_gpi()`, `get_netGPI()` and `get_espritz()` on the same sequences supplied to `plot_prot()`.
- `get_big_pi()` output column `is.bigpi` has been renamed to `is.gpi` to bring the output in line with other GPI predicting functions. BREAKING CHANGE.
- `get_big_pi()` output column order has been changed to: `id`, `is.gpi`, `Quality`, `omega_site` and `PValue`. BREAKING CHANGE.
- New `get_netGPI()` queries the NetGPI web server (https://services.healthtech.dtu.dk/service.php?NetGPI-1.0) for predictions of GPI-anchored proteins.
- `maab()` argument `get_gpi` has an additional option `"netgpi"`. When set, the function will query the NetGPI web server to resolve ambiguities in maab classes depending on GPI-anchoring predictions.
- `plot_prot()` argument `gpi` has an additional option `"netgpi"`. When set, the function will query the NetGPI web server for prediction of omega sites.
- `maab()` now performs correctly when only a single protein sequence is provided as an argument.
- `get_hmm()` receives additional numeric arguments: `ievalue` and `bitscore`. These arguments filter the output, keeping only results with `ievalue` lower than or equal to, and `bitscore` higher than or equal to, the supplied values. This is useful when `get_hmm()` is used from `plot_prot()` to avoid plotting weakly identified domains.
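
The filtering rule described for `ievalue` and `bitscore` can be illustrated on a toy data frame without contacting any server. This is only a sketch under assumptions: `filter_hits()` is a hypothetical helper, the toy table and its column names are invented for illustration, and the threshold values are arbitrary.

```r
# Toy table standing in for domain hits. The column names mirror the
# argument names and are an assumption made for this example.
hits <- data.frame(
  id       = c("seq1", "seq1", "seq2"),
  domain   = c("A", "B", "C"),
  ievalue  = c(1e-30, 0.5, 1e-8),
  bitscore = c(120.3, 8.1, 35.0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose independent e-value is at most
# `ievalue` and whose bit score is at least `bitscore`.
filter_hits <- function(hits, ievalue = 1e-5, bitscore = 20) {
  keep <- hits$ievalue <= ievalue & hits$bitscore >= bitscore
  hits[keep, , drop = FALSE]
}

strong <- filter_hits(hits)  # drops the weak hit "B"
```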
- `get_big_pi()` and `get_pred_gpi()` now return `NA` in the respective `is.bigpi` and `is.gpi` columns when the servers are unable to make a prediction due to non-amino acid letters or the length of the sequence.
- `maab()` now correctly leaves class ambiguities unresolved when the logical vector provided as the `gpi` argument contains `NA` values. Previously it returned `NA` as `maab_class`. This is useful when `maab()` is called with `get_gpi = 'predgpi'` or `get_gpi = 'bigpi'` arguments, and the corresponding servers are unable to make a prediction for a sequence due to non-amino acid letters or the length of the sequence.
- `predict_hyp()` internal model is updated to the 2nd version ('V2'). Predictions are around 25% faster compared to the first version. The performance in terms of accuracy is similar based on the test set used. 'V2' was created in a more streamlined manner and is the default model. The old model ('V1') is still available using the `version` argument to `predict_hyp()`.
- `predict_hyp()` now checks if all provided ids are unique. Previously non-unique ids caused an error at the end of computation.
- `predict_hyp()` `sequence` output has changed for sequences containing non-amino acid letters. Previously NA was returned for such sequences. At present all "P" for which the probability is higher than the defined threshold (`tprob` argument) are changed to "O" and all others are left as "P".
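
The thresholding rule for the `sequence` output can be sketched in base R. This is an illustration only, not the package's implementation: `mark_hyp()` is a hypothetical helper, the probabilities are made up, and the threshold of 0.5 is an arbitrary value chosen for the example rather than the package default.

```r
# Hypothetical helper (not ragp's implementation): replace "P" with "O" at
# the positions whose predicted probability is higher than `tprob`.
mark_hyp <- function(sequence, p_pos, prob, tprob = 0.5) {
  residues <- strsplit(sequence, "")[[1]]
  # every listed position must actually hold a proline
  stopifnot(all(residues[p_pos] == "P"))
  residues[p_pos[prob > tprob]] <- "O"
  paste(residues, collapse = "")
}

# "X" is a non-amino acid letter; the sequence is still returned
marked <- mark_hyp("SPPXAP", p_pos = c(2, 3, 6), prob = c(0.9, 0.2, 0.7))
```

Here the prolines at positions 2 and 6 exceed the threshold and become "O", while the one at position 3 stays "P".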
- `maab()` now returns correct output when no MAAB classes are found and the `get_gpi` argument is set to `"predgpi"` or `"bigpi"`. Previously this caused an error due to `seqinr::write.fasta()` attempting to write a file with no sequences.
- `pfam2go()` now takes Pfam > GO mappings from ftp://ftp.geneontology.org/pub/go/external2go/pfam2go instead of http://geneontology.org/external2go/pfam2go.
- `get_phobius()`, `get_big_pi()`, `get_pred_gpi()`, `maab()` and `plot_prot()` gain an additional argument `progress`. `progress` is a logical value determining whether to show the progress bar (default `FALSE`).
- `get_targetp()`, `get_signalp()` and `get_hmm()` argument `progress` is now set to `FALSE` by default.
- `get_hmm()` argument `verbose` is now set to `FALSE` by default.
- Added `pkgdown` site for `ragp` at: https://missuse.github.io/ragp/.
- `get_targetp()` and `get_signalp()` gain additional arguments `progress` and `attempts`. `progress` is a logical value determining whether to show the progress bar (default `TRUE`). `attempts` is an integer value determining the number of repeated attempts if the server is unresponsive (default 2). These functions now return finished queues if the server becomes unresponsive.
- `get_big_pi()`, `get_espritz()`, `get_hmm()`, `get_phobius()`, `get_pred_gpi()`, `get_signalp()`, `get_targetp()`, `maab()`, `predict_hyp()`, `scan_ag()` and `scan_nglc()` now use the S3 object system, which will make further extensions of accepted inputs straightforward.
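
The `attempts` behavior described above (retry when the server does not respond, give up after a fixed number of tries) can be sketched as a small retry loop. This is an assumption about the general pattern, not the package's code; `with_attempts()` and `flaky()` are hypothetical names, and `flaky()` simply simulates a server that fails on its first call.

```r
# Hypothetical helper: call `f()` up to `attempts` times and return the first
# successful result, or NULL if every attempt raised an error.
with_attempts <- function(f, attempts = 2) {
  for (i in seq_len(attempts)) {
    result <- tryCatch(f(), error = function(e) NULL)
    if (!is.null(result)) return(result)
  }
  NULL
}

# Simulated query that fails once and then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 2) stop("server unresponsive")
  "prediction"
}

out <- with_attempts(flaky, attempts = 2)
```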
- Completely rewrote the code for `scan_ag()`, which is now simplified and easier to read. The function is now about 20% slower.
- Fixed a bug in `scan_ag()` when argument `exclude_ext = "all"` which resulted in detection of unwanted amino acids in certain sequence arrangements. The bug occurred only if AG glycomodules were detected; it did not introduce any. If you used this function with `exclude_ext = "all"` it is advisable to rerun these analyses.
- Removed deprecated functions: `get_signalp_file()`, `get_targetp_file()`, `get_phobius_file()`.
- New `get_espritz()` queries the ESpritz web server for predictions of protein disordered regions.
- `plot_prot()` gains an additional argument `disorder`, a logical indicating whether the predicted disordered regions should be plotted. Defaults to `FALSE`.
- Fixed a bug in function `plot_prot()` introduced in 0.1.0.0003 which prevented GPIs from being plotted when `gpi = "bigpi"`.
- New `get_pred_gpi()` queries the PredGPI web server for predictions of GPI presence and omega site location.
- `maab()` argument `get_gpi` has been changed and now accepts strings as input: `get_gpi = c("bigpi", "predgpi", "none")`, which indicate whether to query the Big Pi server, the PredGPI server, or neither to resolve class ambiguities.
- `plot_prot()` argument `gpi` has been changed and now accepts strings as input: `gpi = c("bigpi", "predgpi", "none")`, which indicate whether to query the Big Pi server, the PredGPI server, or neither to plot GPI positions.
- Fixed a bug in `plot_prot()` which caused extra-cellular regions to start at 0 instead of 1.
- Fixed a bug in `get_big_pi()` when sequences shorter than 55 amino acids were provided.
- Removed rvest dependency
- Added vignette.
- When using `plot_prot()` with argument `dom_sort = "ievalue"` the domains with the lowest independent e-value will now be correctly plotted on top.
- The ratio of x and y axes in `plot_prot()` output is now calculated based on sequence length and should provide more consistent diagrams.
- `get_hmm()` can now handle protein sequences of arbitrary length by splitting them into several shorter overlapping sequences and querying hmmscan.
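
Splitting a long sequence into shorter overlapping pieces, as described for `get_hmm()`, can be sketched in base R. The window width and overlap used by the package are not stated here, so the values below are arbitrary, and `split_overlapping()` is a hypothetical helper rather than the package's implementation.

```r
# Hypothetical helper: cut a sequence into windows of `width` residues where
# consecutive windows share `overlap` residues.
split_overlapping <- function(sequence, width = 1000, overlap = 100) {
  stopifnot(width > overlap, overlap >= 0)
  n <- nchar(sequence)
  if (n <= width) return(sequence)        # short enough, no splitting needed
  step <- width - overlap
  starts <- seq(1, n - overlap, by = step)
  # the last window is truncated at the end of the sequence
  substring(sequence, starts, pmin(starts + width - 1, n))
}

toy <- paste(rep("A", 25), collapse = "")
pieces <- split_overlapping(toy, width = 10, overlap = 2)
piece_lengths <- nchar(pieces)
```

With these toy values the windows cover residues 1-10, 9-18 and 17-25, so a domain that straddles a cut point is still seen whole in at least one piece, provided it is no longer than the overlap.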
- New `plot_prot()` returns a `ggplot2` diagram of protein structure based on `hmmscan` domain annotation and several types of predictions.
- New `get_signalp()`, `get_targetp()` and `get_phobius()` replace deprecated `get_signalp_file()`, `get_targetp_file()` and `get_phobius_file()`. These functions accept R objects as well as FASTA files as input (#5).
- New `plot_signalp()` takes one single letter protein sequence and returns a detailed SignalP prediction along with a plot.
- New `scan_nglc()` detects motifs for N-glycosylation on asparagine residues.
- `maab()` gains an additional logical argument `get_gpi` (default `FALSE`); if `TRUE`, `get_big_pi` will be called on all sequences belonging to one of the HRGP classes, thus resolving class ambiguities that depend on GPI knowledge.
- Added a `NEWS.md` file to track changes to the package.
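
As background for the `scan_nglc()` entry above, the commonly cited N-glycosylation sequon is N-X-S/T, where X is any residue except proline. The sketch below finds the asparagine positions of that pattern with a base R regular expression. Whether `scan_nglc()` uses exactly this pattern is an assumption, and `find_sequons()` is a hypothetical helper.

```r
# Hypothetical helper: return positions of asparagines that start an
# N-X-S/T sequon (X is not proline). The lookahead keeps overlapping
# sequons from hiding each other.
find_sequons <- function(sequence) {
  m <- gregexpr("N(?=[^P][ST])", sequence, perl = TRUE)[[1]]
  pos <- as.integer(m)
  pos[pos > 0]   # gregexpr reports -1 when there is no match
}

sequon_pos <- find_sequons("MNGTNPSANAS")
```

In the toy sequence the asparagines at positions 2 and 9 qualify, while the one at position 5 is followed by proline and is skipped.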
- Significantly improved the speed of `get_big_pi()`. Removed the verbose argument. Added a progress bar (#3).
- `get_hmm()` gains new arguments: `timeout` - time in seconds to wait for the server response (default is 10s), and `attempts` - number of attempts if the server is unresponsive (default is 2 times) (#3).
- `get_signalp()` and `get_targetp()` now perform 10 parallel queries instead of arbitrarily many. This results in higher stability when running on many sequences, and a slight reduction in speed. These functions now have a progress bar.
- `get_phobius()` gains a progress bar.
- Changed internal behavior when the `trunc` argument is specified for `get_signalp()`. Now the truncation is performed prior to sending the files to the server, resulting in higher efficiency.
- `scan_ag()` gains a logical `tidy` argument. The specific output is available when `simplify = FALSE` and `tidy = TRUE` in the function call (#1).