cesine edited this page Aug 7, 2012 · 4 revisions

Your data is your data. We will never use your private data for any purpose. In addition, we take extra steps to safeguard our CouchDB database server so that other programmers can't get into your data unless you make it public.

If you make your corpus public, you can still encrypt any datum that is confidential. This lets you open up and share your corpus even when you know it contains sensitive information.

If you make your corpus public, we follow EMELD's requirements that it be search-engine friendly (i.e. "discoverable"). If you don't want Google to index your data, make your corpus private. If you don't mind Google indexing your data but don't want people who find it through Google to trace it back to the research project behind it, edit your Corpus's public view to hide any sensitive information, such as research goals and project members.

If you make your corpus public, anyone on the web can use your data for their own purposes (this includes community members, other linguists, and programmers). If you want them to cite you when they use your data, put a license (MIT, Creative Commons) on your Corpus's public view asking them to cite you or your project/lab as the data source.

You also don't need to make your whole corpus public. You can keep your corpus private and create a data list that contains only the data you want to make public (i.e. to share in a widget on your wiki, blog, or website).