FASTA Reader: supported file endings #2054
Comments
I am afraid that's not so easy. The FormattedFile class can be specialized through its third template parameter. I hope the following helps and works.
// Tag that specializes FormattedFile for OpenMS.
struct OpenMSFastaAdaptor_;
using OpenMSFastaAdaptor = Tag<OpenMSFastaAdaptor_>;
// Your custom file format.
struct FastaAdaptor_;
using FastaAdaptor = Tag<FastaAdaptor_>;
// List of valid input formats for your customized sequence file.
typedef
TagList<Fastq,
TagList<Fasta,
TagList<FastaAdaptor
> > >
OpenMSSeqInFormats;
// Overloaded file format metafunction.
template <>
struct FileFormat<FormattedFile<Fastq, Input, OpenMSFastaAdaptor> >
{
typedef TagSelector<OpenMSSeqInFormats> Type;
};
// Specify the valid ending for your fasta adaptor:
template <typename T>
struct FileExtensions< FastaAdaptor, T>
{
static char const * VALUE[1]; // One entry; the size must match the definition of VALUE.
};
template <typename T>
char const * FileExtensions< FastaAdaptor, T>::VALUE[1] =
{
".tmp" // fasta file with tmp ending.
};
// Overload the readRecord function:
template <typename TIdString, typename TSeqString, typename TFwdIterator>
inline SEQAN_FUNC_ENABLE_IF(Not<IsSameType<TFwdIterator, FormattedFile<Fastq, Input, OpenMSFastaAdaptor > > >, void)
readRecord(TIdString & meta, TSeqString & seq, TFwdIterator & iter, FastaAdaptor)
{
readRecord(meta, seq, iter, Fasta());
}
@cbielow Does this solve your problem?
@h-2 @rrahn A documented solution to this issue in ReadTheDocs would be very useful, as this problem also affects developers who wish to create tools for Galaxy. When writing a Galaxy plugin you need to account for Galaxy changing all file formats to become a It would be nice to be able to accept any file extension (tmp/dat etc). Maybe using a flag, e.g. ./program --fastq 001.dat
@martinjvickers There is no nice solution for this in the SeqAn2 code base and tbh I think any other solution will be bad usability-wise for users (it should auto-detect!) and confusing for programmers: what if you set
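To illustrate the "it should auto-detect" idea from the comment above, here is a minimal standalone sketch that guesses the format from the first record marker rather than from the file name. It is not SeqAn's API: `SeqFormat` and `guessSeqFormat` are hypothetical names introduced only for this example, and it assumes FASTA records start with `>` and FASTQ records start with `@`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper types, not part of SeqAn.
enum class SeqFormat { Fasta, Fastq, Unknown };

// Guess the sequence format from the first non-whitespace character of the
// stream. The character is only peeked at, so it is still available to
// whatever parser reads the stream afterwards.
inline SeqFormat guessSeqFormat(std::istream & in)
{
    in >> std::ws;              // skip leading blank lines and spaces
    int const c = in.peek();    // look at the record marker without consuming it
    if (c == '>')
        return SeqFormat::Fasta;
    if (c == '@')
        return SeqFormat::Fastq;
    return SeqFormat::Unknown;  // empty input or an unrecognised marker
}
```

A caller could open a file with any extension (`.tmp`, `.dat`) as a `std::ifstream`, call `guessSeqFormat` on it, and then dispatch to the matching reader. A single leading character is a weak signal, so a real tool would probably still let the user override the guess.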
@h-2 I 100% agree with the current way that SeqAn treats file formats. For command line tools it makes perfect sense and I don't think that should change. I've found it quite annoying that Galaxy renames the file. However, Galaxy is very popular, so it has been important to ensure the tools I write can be used with it, leading to the need to write the terrible wrappers in Galaxy (e.g. symlinking the dat file to a fastq/fastq or whatever before running it, potentially leaving existing symlinks in place if something dies before they're removed). Your example shows exactly how you can't (and shouldn't) use SeqAn at the moment to support this, e.g. a However, I'm not convinced that this is necessary for SeqAn as a whole, but as you say a workaround is needed. It's quite telling that there is no documentation about creating Galaxy Workflows in the ReadTheDocs despite headers being in place. It's not easy to create a Galaxy workflow for a SeqAn tool following the Galaxy documentation because of this issue. Maybe on the Galaxy issue it shouldn't be to alter SeqAn but to demonstrate decent wrappers for SeqAn tools in the SeqAn docs. That probably doesn't help the OP @cbielow though.
I hope this helps to develop file ending wrappers for your use cases.
We'd like to upgrade from SeqAn 1.6 to SeqAn 2.x for OpenMS in the near future.
Currently, readRecord() is holding us back a little, since it seems to enforce certain file endings on FASTA input files and refuses to read *.tmp files, which can occur in auto-generated class-test files or maybe even workflow systems. This is a deal breaker for us. Is there a way to add a flag in readRecord() or somewhere appropriate that switches off the filename suffix restrictions?