Contains an R script that transforms loosely formatted transcripts into the conventional CHAT format. The source transcripts come from the SES studies, which are part of a large corpus of interviews conducted from the 1980s to 2010 by C. W. Pfaff in the course of research on multilingualism. The .cha (CHAT) file was generated from a sample transcript with the EXMARaLDA Partitur-Editor to demonstrate how a standard-conformant transcript should be constructed.
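To illustrate the kind of transformation meant here, the following is a minimal sketch in R, not the repository's actual script. It assumes the loose input has one utterance per line in a `CODE: text` layout; the function name `to_chat`, the participant codes, and the language code `deu` are hypothetical placeholders. It emits only a bare CHAT skeleton (`@UTF8`, `@Begin`, `@Languages`, `@Participants`, main tiers, `@End`) and leaves out the `@ID` headers that a fully checked CHAT file would also carry.

```r
# Hypothetical helper (not from the repository): turn loosely formatted
# "CODE: text" lines into a minimal CHAT skeleton.
to_chat <- function(lines, participants, languages = "deu") {
  # Drop surrounding whitespace and empty lines.
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]

  # Split each line into speaker code and utterance (assumed input layout).
  m <- regmatches(lines, regexec("^([A-Za-z0-9]+)\\s*:\\s*(.*)$", lines))
  ok <- lengths(m) == 3          # lines that do not match are skipped
  speaker <- toupper(vapply(m[ok], `[`, character(1), 2))
  utt <- vapply(m[ok], `[`, character(1), 3)

  # CHAT utterances end in a terminator; append " ." where none is present.
  needs_term <- !grepl("[.?!]$", utt)
  utt[needs_term] <- trimws(paste(utt[needs_term], "."))

  header <- c(
    "@UTF8",
    "@Begin",
    paste0("@Languages:\t", paste(languages, collapse = ", ")),
    paste0("@Participants:\t",
           paste(names(participants), participants, collapse = ", "))
  )
  # Main tiers: "*CODE:<tab>utterance"; none if no input line matched.
  main <- if (length(speaker) > 0) {
    paste0("*", speaker, ":\t", utt)
  } else {
    character(0)
  }
  c(header, main, "@End")
}

# Usage with made-up sample lines and placeholder participant roles.
chat <- to_chat(
  c("int: wie heisst du", "", "chi: Ali ."),
  participants = c(INT = "Interviewer", CHI = "Target_Child")
)
writeLines(chat)
```

Writing the result with `writeLines(chat, "sample.cha")` (file name chosen here for illustration) would give a file that can be compared against the reference .cha sample; real transcripts will likely need additional handling for overlaps, pauses, and comments.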
- 20221023(22.47): generated the output files of the transformation, named .cha; import into the EXMARaLDA Partitur-Editor works as intended, so the transformation is successful.
- 20230305(19.01): script (conc_essai) to create a database of the lemmatized corpus (via SketchEngine). The final database allows corpus analysis independent of the SketchEngine framework. The SketchEngine lemmatization is based on the standardized transcripts created with the transformation script above.