### Uni Data Preparation

- https://pypi.org/project/spacy-langdetect/
- https://spacy.io/models
- https://github.com/explosion/spaCy/issues/11038

#### Installing packages

In [1]:
# Install the packages needed for the analysis. To install a package, remove the leading '#' from its line and run the cell.
# !pip install spacy
# !pip install spacy_langdetect
# !pip install googletrans==4.0.0rc1
# !python -m spacy download en_core_web_sm
# !python -m spacy download de_core_news_sm
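
Before uncommenting any of the install lines, it can help to check which packages are already importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the import names in `required` are assumed to correspond to the pip packages listed above.

```python
import importlib.util

def missing_packages(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

# Import names assumed to match the pip packages listed above
required = ["spacy", "spacy_langdetect", "googletrans"]
print("Missing packages:", missing_packages(required))
```

Only the packages reported as missing would then need their install line uncommented.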

#### Loading required libraries

In [2]:
# Load necessary packages
import json
import spacy
from spacy_langdetect import LanguageDetector
from spacy.language import Language
from googletrans import Translator

#### Merging the 10 university datasets

In [3]:
# Build the dataset from the 10 JSON files scraped from the Paderborn University module website
with open("advanced_time_series_analysis.json", encoding = 'utf8') as time_series:
    data1 = json.load(time_series)
    
with open("applied_ml_for_text_analysis.json", encoding = 'utf8') as machine_learning:
    data2 = json.load(machine_learning)

with open("data_science_for_business.json", encoding = 'utf8') as data_science:
    data3 = json.load(data_science)

with open("deep_learning_in_social_media.json", encoding = 'utf8') as deep_learning:
    data4 = json.load(deep_learning)

with open("digital_markets.json", encoding = 'utf8' ) as digital_markets:
    data5 = json.load(digital_markets)
    
with open("econometrics.json", encoding = 'utf8') as econometrics:
    data6 = json.load(econometrics)
    
with open("methoden_der_data_science.json", encoding = 'utf8') as methoden_data_science:
    data7 = json.load(methoden_data_science)
    
with open("social_business_analytics.json", encoding = 'utf8') as social_business_analytics:
    data8 = json.load(social_business_analytics)
    
with open("statistical_learning.json", encoding = 'utf8') as statistical_learning:
    data9 = json.load(statistical_learning)
    
with open("topic_financial_data_science.json", encoding = 'utf8') as financial_data_science:
    data10 = json.load(financial_data_science)
        
uni_data = data1 + data2 + data3 + data4 + data5 + data6 + data7 + data8 + data9 + data10

# Take a look at the merged data
uni_data

['Inhalte (short description):',
 '\u200bThis is an advanced lecture in time series analysis developed based on basic knowledge in time series. Hence, one of our Master modules W4451 or Bachelor modules W2453, or a comparable module that you have visited at another university is a necessary requirement. The main topics of this module will be divided into two parts: Part 1: Advanced linear time series models, including the analysis of time series with seasonality and different calendar effects, multivariate time series models as well as long memory time series models; and Part 2: advanced topics of non-linear and functional time series, including long memory volatility and duration models, multivariate volatility and correlation models, volatility and correlation measures based on high-frequency financial data as well as the analysis of functional time series with short or long memory. The focuses are on the introduction of the theory and methods, practical implementation in R and their

In [4]:
# Save the merged data to a JSON file named uni_data.json
with open('uni_data.json', 'w', encoding = 'utf8') as file:
    json.dump(uni_data, file, ensure_ascii = False)
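
The ten near-identical `with open(...)` blocks used for merging can also be written as a loop, and the saved file can be read back to confirm the round trip. This is a sketch, assuming every file holds a JSON list of strings. `merge_json_lists` and `save_json` are hypothetical helper names, and the demo writes small stand-in files to a temporary directory rather than using the real scraped files.

```python
import json
import tempfile
from pathlib import Path

def merge_json_lists(paths):
    """Load each JSON file (expected to contain a list) and concatenate the lists in order."""
    merged = []
    for path in paths:
        with open(path, encoding="utf8") as handle:
            merged.extend(json.load(handle))
    return merged

def save_json(data, path):
    """Write data as JSON, keeping non-ASCII characters (e.g. umlauts) readable."""
    with open(path, "w", encoding="utf8") as handle:
        json.dump(data, handle, ensure_ascii=False)

# Demo with two stand-in files (not the real scraped data)
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.json"
    second = Path(tmp) / "b.json"
    save_json(["Inhalte (short description):", "Schätzungen zufolge ..."], first)
    save_json(["Lernergebnisse (learning outcomes):"], second)

    merged = merge_json_lists([first, second])
    out_path = Path(tmp) / "merged.json"
    save_json(merged, out_path)

    # Reading the file back should give exactly the merged list
    reloaded = merge_json_lists([out_path])
    print(len(merged), reloaded == merged)
```

With the real data, the list of the ten file names would be passed to `merge_json_lists` in place of the stand-in paths.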

#### Data processing and language detection with spaCy

In [5]:
# Load the pre-trained small spaCy model for English (only the English model is loaded here)
nlp_en = spacy.load('en_core_web_sm')

In [6]:
# Define a factory function that creates the language detector component, then register it with spaCy under the name "language_detector"
def language_detector(nlp, name):
    return LanguageDetector()

Language.factory("language_detector", func = language_detector)

<function __main__.language_detector(nlp, name)>

In [7]:
# Add the language detector to the pipeline
nlp_en.add_pipe('language_detector', last = True)

<spacy_langdetect.spacy_langdetect.LanguageDetector at 0x1f85419e070>
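
Once the detector is in the pipeline, spacy_langdetect exposes its result as `doc._.language`, a dictionary with a `language` code and a `score`. The sketch below groups a list of strings by detected language. The helper `group_by_language` and the `min_score` threshold of 0.9 are illustrative assumptions, not part of the original notebook.

```python
from collections import defaultdict

def group_by_language(texts, nlp, min_score=0.9):
    """Group texts by detected language code.

    `nlp` is any callable returning a doc whose `_.language` attribute is a
    dict with "language" and "score" keys (the shape spacy_langdetect provides).
    Texts scoring below `min_score` (an assumed threshold) go under "unknown".
    """
    groups = defaultdict(list)
    for text in texts:
        result = nlp(text)._.language
        code = result["language"] if result["score"] >= min_score else "unknown"
        groups[code].append(text)
    return dict(groups)

# Intended usage with the pipeline built above:
# groups = group_by_language(uni_data, nlp_en)
# print({code: len(items) for code, items in groups.items()})
```

Taking the pipeline as a parameter keeps the helper independent of which spaCy model is loaded, so the same function would work with a German pipeline as well.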

#### Translate the collected data from the university website

In [8]:
# Input text: the contents of uni_data printed above, pasted in as a single string
text = """Inhalte (short description):", "​This is an advanced lecture in time series analysis developed based on basic knowledge in time series. Hence, one of our Master modules W4451 or Bachelor modules W2453, or a comparable module that you have visited at another university is a necessary requirement. The main topics of this module will be divided into two parts: Part 1: Advanced linear time series models, including the analysis of time series with seasonality and different calendar effects, multivariate time series models as well as long memory time series models; and Part 2: advanced topics of non-linear and functional time series, including long memory volatility and duration models, multivariate volatility and correlation models, volatility and correlation measures based on high-frequency financial data as well as the analysis of functional time series with short or long memory. The focuses are on the introduction of the theory and methods, practical implementation in R and their application in forecasting and decision making. Practical implementation in Python will be discussed as well, given that suitable Python packages are available. Application to economic and financial time series, particularly in sustainable economics and finance, will be strongly emphasised. 
Semiparametric extensions of corresponding approaches under non-stationary component time series models will be described as far as possible.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...​Advanced knowledge in time series and forecasting; advanced R/Python skills", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...can use advanced computational tools and sophisticated modern statistical approaches for illustrating and modeling and analysing different kinds of datagain skills to analyze big multivariate and functional data setsgain further knowledge about the programming language R and basic knowledge of Pythonimprove their computing, data illustration and data management skillsimprove their analytical and empirical study skills", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...cooperate and work in groupsability for carrying out a practically relevant projectdeep understanding of environmental time series", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​gain more expertise and skills in scientific working and writinggain strong skills in modern data analysis and data scienceare further trained in independent and research related studying", "Inhalte (short description):", "Schätzungen zufolge sind heutzutage etwa 80% aller Daten unstrukturiert. Im Gegensatz zu strukturierten Daten, die wohlstrukturiert und inhaltlich meist numerisch sind, sind unstrukturierte Daten oft textuell und daher schwieriger zu interpretieren. Die Aufgabe, Wissen aus Textdokumenten zu extrahieren, bekannt als Textanalyse oder natürliches Sprachverständnis, ist äußerst komplex und immer noch begrenzt durch die Möglichkeiten von Computern, die Feinheiten menschlicher Sprachen zu verstehen.  
In diesem Hands-on-Seminar werden die Studierenden in den aktuellen Stand des maschinellen Lernens und die Techniken der Verarbeitung natürlicher Sprache eingeführt (z.B. Textklassifikation, Themenmodellierung, künstliche neuronale Netze, Worteinbettungen). Durch Programmierübungen (Python) können die Studierenden nicht nur ihr theoretisches Wissen über verschiedene Algorithmen vertiefen, sondern haben auch die Möglichkeit, diese Methoden auf reale Probleme anzuwenden. It is estimated that approximately 80%of all existing data is unstructured. Unlike structured data, which is usuallywell-structured and mostly numerical, unstructured data is often textual andtherefore far more difficult to interpret. The task of extractingknowledge from text documents, known as text analysis or natural languageunderstanding, is extremely complex and still limited by the ability ofcomputers to understand the subtleties of human languages. In this hands-onseminar, students will be introduced to the current state of machine learningand natural language processing techniques (e.g. text classification, topicmodelling, artificial neural networks, word embeddings). 
With programmingexercises (Python), students deepen their theoretical knowledge of differentalgorithms and get the opportunity to apply these methods to real-world issues.​", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...kennen die Herausforderungen bei der automatischen Analyse natürlich-sprachiger Textdatenkennen verschiedene Textanalyse-Techniken und können die zugrundeliegende Logik beschreibenkennen die Stärken und Schwächen spezifischer Textanalyse-Techniken​Students...​are aware of the challenges of automatically analysing natural language text dataknow different text analysis techniques and can describe the underlying logicknow the strengths and weaknesses of specific text analysis techniques", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...sammeln Textdaten aus dem Web oder unternehmensinternen Datenquellenbereinigen und transformieren Textdaten, um sie für statistische Analysen nutzbar zu machenwenden Textanalyse-Techniken auf einen vorgegebenen Datensatz an​​​Students...collect text data from the web or company data sourcescleanse and transform text data to make it usable for statistical analysesapply text analysis techniques to a given data set​", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...​lösenGeschäftsprobleme (z.B. im Marketing oder Servicemanagement) durch dieErfassung und Analyse von Textdaten (z.B. Online-Rezensionen, Social MediaBeiträge, E-Mails)        ​​Students...solve business problems (e.g. in marketing or service management) by collecting and analysing text data (e.g. 
online reviews, social media posts, emails)", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...wählen die richtige Text-Mining-Technik für ein vorgegebenes (bestimmtes) Problem ausbewerten die Qualität der Text-Mining-Ergebnissesind sich der Grenzen der automatisierten natürlichen Sprachverarbeitung bewusst​Students...select the correct text mining technique for a given (specific) problem.evaluate the quality of the text mining resultsare aware of the limitations of automated natural language processing", "Inhalte (short description):", "​Unter dem Begriff Data Science wird im Allgemeinen die Extraktion von Wissen aus großen Datenmengen verstanden. Typischerweise ist das Ziel von Data Science, durch das gewonnene Wissen die Effektivität und Effizienz von Entscheidungsprozessen zu verbessern. In diesem Modul werden grundlegende und fortgeschrittene Konzepte und Methoden der Data Science und Ihre Anwendung in den Wirtschaftswissenschaften behandelt. Der Fokus liegt dabei auf Verfahren des überwachten und unüberwachten maschinellen Lernens (z. B. lineare und logistische Regression, Random Forest, Boosted Decision Trees, Neuronale Netze, Clustering, Dimensionality Reduction). Es handelt sich bei diesem Modul um ein Flipped Classroom Modul (https://de.wikipedia.org/wiki/Umgedrehter_Unterricht), d. h. Studierende erarbeiten die theoretischen Inhalte durch Videos und Lehrbücher unterstützt im Selbststudium. Die Anwendung der erlernten Inhalte geschieht in Präsenzveranstaltungen anhand von praxisnahen Fragestellungen und Fallstudien. Zielgruppe des Moduls sind Studierende der Wirtschaftswissenschaften, die erste praktische Erfahrungen mit dem Thema Data Science machen wollen. 
Es wird die Bereitschaft zur Einarbeitung in die Programmiersprache R erwartet.The term Data Science generally describes the extraction of knowledge from large amounts of data, its goal to improve the effectiveness and efficiency of decision-making processes through the knowledge thus gained. The course covers basic and advanced concepts and methods of Data Science and their application in an economic context with a focus on supervised and unsupervised machine learning methods (e.g. linear and logistic regression, random forest, boosted decision trees, neural networks, clustering and dimensionality reduction). The course is organized according to the flipped-class-room concept (https://en.wikipedia.org/wiki/Flipped_classroom), i.e. students work out the theoretical contents supported by videos and textbooks in self-study and practising the application in classroom sessions based practical issues and case studies. The course is aimed at students of economics who want to gain first practical experience in the field of data science. Willingness to learn the R programming language is a basic requirement.​", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...​kennen die Gemeinsamkeiten und Unterschiede von überwachten und unüberwachten Methoden des maschinellen Lernenskennen grundlegende lineare (insb. lineare und logistische Regression und deren Erweiterungen) und nicht-lineare Modelle (z.B. baumbasierte Verfahren, neuronale Netze) des maschinellen Lernens und können deren Funktionsweise erläuternkennen Methoden und Metriken zur Beurteilung der Qualität von Modellen des maschinellen Lernens​Students...​know common features and differences between supervised and unsupervised machine learning methods.know basic linear (esp. linear and logistic regression and their respective extensions) and non-linear models (e.g. 
tree-based methods, neural networks) of machine learning and can explain how they workknow methods and metrics for assessing the quality of machine learning models", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...​wenden verschiedene Verfahren des maschinellen Lernens zur Erklärung und Vorhersage von wirtschaftlichen Phänomenen anevaluieren die Qualität von überwachten und unüberwachten Modellen des maschinellen LernensStudents...​apply various machine learning techniques to explain and predict economic phenomenaevaluate the quality of supervised and unsupervised machine learning models", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...​lösen Übungen und Fallstudien gemeinsam in PräsenzveranstaltungenStudents..​​​​solve exercise tasks and case studies together in classroom sessions", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​erarbeiten Lerninhalte selbstständig zu Hause mit Hilfe eines Lehrbuchs, begleitenden Videos und Präsentationsfolien​Students...independently work out course content at home with the help of a textbook, accompanying videos and presentation slides", "Inhalte (short description):", "In the past few years, the popularity of the social media has grown remarkably, with constantly growing up amounts of users sharing all kinds of information through different platforms. More users mean more data to be mined. Therefore, it is vital for marketing organizations to be aware of how people express their opinions and how their feedback can affect their business. This has given rise to Social Media Analytics in order to extract business insight and value from consumer data. Social Media Analysis is a broad concept consisting of Social Network Analysis, Machine learning, Data Mining, Information Retrieval, and Natural Language Processing. 
Deep learning techniques as a subfield of machine learning enable machines to learn by themselves to classify and cluster the data. The fact that the data contained in social media are highly unstructured named as Big Data, makes deep learning an extremely valuable tool for companies to manipulate the data. These companies use Deep Neural Networks as the foundation stones of deep learning, to decide which concept could be interesting to which customers.To cover the concepts of deep learning in Social Media, this course starts with theoretical explanation of machine learning and data mining. In order to mine the huge number of user-contributed materials (e.g. photographs, videos, and textual context) in social media, different types of Artificial Deep Neural Networks (ADNN) will be explained. Simultaneously, the course covers basics of Python programming language and its data science libraries to equip the students with the tools which are needed to take advantage of the wealth of Big Data.Later, different deep learning algorithms such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTM), Restricted Boltzmann Machines (RBM), Autoencoders (AE) and Self-organizing Maps (SOM) will be introduced. More specifically the time series analysis, text- and sentiment analysis, image analysis and recommender systems using Deep Neural Networks will be presented. Finally, the topic of Social Recommender Systems (SRS) will be discussed, in which the social relations can be potentially exploited to improve the performance of online recommender systems. 
To finish the course fruitfully, some mini-projects regarding mining different data derived from social media platforms will be handed to the students.The notes will be declared at the end of the semester", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...… restate the recent technological evolution and related academic works in the field of social media analytics using state-of-the-art deep learning algorithms,… conduct relevant technology-driven methods to exploit insight from unstructured social media data,… integrate approaches of data-driven analysis and their benefits towards solving business problems,… inspect the inevitable importance of social networks to get a deeper insight in management problems,… realize the importance of customers’ and users’ data to create business value and insights,… conduct research in the field of social media and collaborative technologies,…develop an analytical approach to address a research or managerial problem in the social media field.", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...… recognize the concepts of Artificial Intelligence, Machine Learning, Data Mining, and Deep Learning in social media,… describe different paradigms of data mining such as supervised and unsupervised learning,… differentiate various types of data in the field of social media and the relative algorithms to analyze them,… explain the theory of Artificial Neural Networks and its training procedure,… differentiate different deep learning algorithms in social media data (e.g., Convolutional Neural Network for image analysis, Recurrent Neural Networks for text analysis, …)… exploit social media data and analyze them with deep learning algorithms to get insights for specific business problems,... 
interpret the result of data analysis,… apply different deep learning algorithms on social media data in Python environment using data mining libraries like Numpy, Pandas, Matplotlib, Tensorflow, Scikit-learn, and data visualization tools,…experiment data from most popular social media platforms, including Twitter, Facebook, Google+, StackOverflow, Blogger, YouTube and more,... analyze current research contributions in the context of deep learning in social media.", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...… formulate management-oriented problems in a social business context and address them in a systematic approach based on standard methods of scientific data and content analysis to derive practical implications,… develop cooperative skills in group-based task analysis, report creation, documentation, and presentation,… observe a critical outlook over shared data in social media,… underline the importance of the social media mining in each and everyday life.", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...… develop a critical and informed perspective on the benefits of different deep learning analytical algorithms and their characteristics,… distinguish appropriate deep learning algorithms to best address a given business problem,… interpret and evaluate the quality and the implications of the research results for practitioners and academics,… conduct a systematic academic research on a predefined topic enabled with theoretical background and practical applications.", "Inhalte (short description):", "This course introduces students to basic concepts related to electronic exchanges and information related to these exchanges. Such information includes transactional data (such as orders and trades) and news. The course will also cover how to analyze the impact of news on a market with a focus on financial markets. 
In addition to the methodology required to analyze such data, students will also be introduced to various software tools that support such an analysis.The course is offered by our guest lecturer Prof. Fethi Rabhi (UNSW, Syndey, Australia).", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...learn terms to describe and analyze financial exchanges.", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...can analyze financial market data and news.", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...​work in groups and discuss implications of thelearned content.", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...can apply data analysis techniques to different markets.can assess the impact of news on market prices.", "Inhalte (short description):", "​This module provides the students fundamentalknowledge of quantitative methods in empirical economic research atintroductory and anvanced level. The focus is on the theory, estimation andapplication of simple and multiple linear regression models. After a systematicintroduction to econometrics, selected special topics, such asmulticollinerity, heteroskedasticity, model selection and models with timeseries errors, will be dealt with in details. A brief introduction to theanalysis of panel data will be provided as far as possible. The course iscomputer supported and will be provided with a lot of real data examples.Numerical examples in the lectures and tutorials will be dealt with the publicpowerful programing language R. 
During the visit of this modul you will also beintroduced to the use of R in statistics and econometrics.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...​acquire systematic knowledge of the theory and application of linear regression; fundamental knowledge of special problems and methods to solve them.advanced knowledge of statistical estimation and test theory; knowledge of mathematical modelling; programing skills; teamwork ability.", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...​well known econometric models; model selection; simulation technique in econometrics; knowledge of statistical programing.", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​Training in modeling, presentation of ownresults, internet search, training of selflearning,cooperation and team working skills, improved computing skills, basic researchtraining.", "Inhalte (short description):", "In unserer vernetzten Welt werden in bisher ungekannter Art und Weise Daten generiert und gesammelt. Data Science (Film) bezeichnet die Extraktion von Wissen aus diesen Daten. Das Modul vermittelt grundlegende Konzepte und Methoden entlang des Lebenszyklus eines Data Science Projektes, von der Formulierung der Problemstellung über die Sammlung, Vorbereitung und Visualisierung der Daten bis hin zur Erkennung von Mustern und Trends in diesen mittels Verfahren des maschinellen Lernens (z. B. Regression, Klassifikation, Clustering). Das erlernte Methodenwissen wird kontinuierlich durch praxisnahe Übungen mit der Programmiersprache R angewandt und vertieft. 
Das Modul umfasst eine Vorlesung sowie eine Übung.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...Die Studierenden …… kennen typische Datenqualitätsprobleme und können diese beschreiben… kennen verschiedene Diagramme zur Darstellung quantitativer Daten und können deren Vor- und Nachteile wiedergeben… kennen einfache Modelle des maschinellen Lernens und können deren Funktionsweise erläutern", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...… bereiten Rohdaten zur anschließenden Visualisierung und statistischen Analyse auf… visualisieren quantitative Daten mittels Diagrammen… wenden verschiedene Verfahren des maschinellen Lernens zur Erkennung von Mustern und Trends in quantitativen Daten an", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...​… lösen betriebswirtschaftlicheProblemstellungen durch die Anwendung von Data Science Methoden", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...… evaluieren die Qualität von Rohdaten… wählen die passenden Methoden zur Visualisierung und statistischen Analyse gegebener Datensätze aus… bewerten die Qualität von Modellen des maschinellen Lernens", "Inhalte (short description):", "​Die verbreitete Nutzung digitaler sozialer Medien durch vernetzte Akteure oder Organisationen hinterlässt eine nie zuvor gekannte Menge digitaler Daten. Diese werden zunehmend durch Unternehmen oder Wissenschaftler genutzt, um die komplexen Abläufe im Web2.0 besser zu verstehen und gegebenenfalls besser auf Kunden einwirken zu können. Die Forschung und auch zahlreiche Unternehmen entwickelten diverse analytische Ansätze, um aus den Massenrohdaten sinnvolle und wirtschaftlich relevante Einsichten zu erzeugen. Social Media Manager beispielsweise verwenden aggregierte Darstellungen der Daten in Dashboards, um den Erfolg ihrer Arbeit (z.B. 
verbessernde Kundenwahrnehmung, Vertrieb) zu messen. Forscher identifizieren generelle Muster und entwickeln Metriken und Theorien.Um sich diesem datenzentrierten Arbeits- und Forschungskontext zu nähern und die Metriken und Einsichten zu erweitern, wird in diesem Modul zunächst das Konzept des Social Business als ein relevanter organisatorischer Kontext vorgestellt. Eine wichtige Rolle hierbei spielen die Managementwerkzeuge von Social Media Managern, welche die firmenrelevanten digitalen Aktivitätsdaten der Onlinenutzer in den sozialen Medien aufbereiten, verdichten und visualisieren. Parallel verwenden die Manager direkte Antworten und Reaktionen ihrer Kundengruppe als qualitative Daten für ihre Analysen. Auf Basis dieser Managementperspektive werden im Modul dann verschiedene Ansätze von Social Media Analytics besprochen und angewendet. Beispiele sind die Erstellung von Personas, Genreanalysen, Community Health Analysen, Time-Series Analysen, Event-Impact Analysen oder Netzwerkanalysen. Parallel zur Untersuchung einiger praktischer Fallbeispiele entwickeln die Teilnehmer ein eigenes Analyseprojekt auf Basis der besprochenen Methoden. Ziele sind hierbei, die komplexen digitalen Phänomene besser zu verstehen, unternehmensrelevante Einsichten für Social Media Manager zu generieren sowie eventuell darüberhinausgehend einen Forschungsbeitrag zu entwickeln (z.B. neue Metriken, Visualisierung oder Aufdeckung genereller Phänomene, Designs). Der zu erstellende Projektbericht basiert auf der Struktur und den Methoden wissenschaftlicher Artikel und ermöglicht dadurch den Teilnehmern, anschließende akademische Arbeiten (z.B. 
Masterarbeit, eigene Publikationen) auf den Ergebnissen des Moduls aufzubauen.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...​… lernen neue wissenschaftliche Erkenntnisse und Artikel im Themenkomplex Social Media / Kooperative Technologien kennen… lernen Ansätze der Datensammlung im Vorfeld der Social Media Forschung… kennen Verfahren zur wissenschaftlichen Datenanalyse und InterpretationStudents learn about… current scientific insights and articles in the social media and collaborative technologies field,…approaches of social media data collection and transformation as a base for social media research,…approaches of scientific data analysis and interpretation.", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...​… analysieren aktuelle Forschungsbeiträge im Kontext von Social Media und kooperativen Technologien,… konzipieren einen eigenen Forschungszugang zum Thema,… erfassen und generieren ein Datenset als Ausgangspunkt wissenschaftlicher Analysen,… wenden Werkzeuge zur Datenanalyse und Interpretation an,… entwickeln einen systematischen Ansatz zum Aufbau und zur Strukturierung eines eigenen akademischen Forschungsprojekts (z.B. als Vorstufe zur Masterarbeit).Students……analyse current research in the field of social media and collaborative technologies,…generate their own transformed social media dataset to fit for research inquiries,…develop a systematic analytical approach to address a research or management problem (e.g., as a precursor for a master thesis).", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...​… können ungeklärte Fragestellungen aus Forscherperspektive zielorientiert und abstrakt formulieren und mit systematischen Standardmethoden der Daten bzw. 
Inhaltsanalyse kritisch untersuchen, sowie praktische Implikationen ableiten.Students……are enabled to formulate management-oriented problems in a social business context and address them in a systematic approach based on standard methods of scientific data and content analysis to derive practical implications.", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​… entwickeln eine kritische Perspektive auf neueste technische Entwicklungen,… wählen die passenden Methoden zur Analyse gegebener Fragestellungen aus,… bewerten die Qualität und Generalisierbarkeit der Ergebnisse und ihrer Implikationen für die Forschung und Praxis.Students…… develop a critical and informed perspective on the benefits of different software-based analytical methods and tools,… can choose in an informed manner appropriate tools and methods to best address a given business problem,… can evaluate the quality and the implications of the research results for practitioners and academics", "Inhalte (short description):", "​This module introduces the students to Data Science and one ofthe main sub-area of Data Science, e.g. Statistical Learning, as well as theprogramming languages R and Python. Covered topics of this course are e.g. abrief introduction to Data Science, an Introduction to StatisticalLearning,  Linear Regression, Classification,Cross-Validation and Resampling Methods, Model Selection using StepwiseRegression and Regularization using Ridge Regression and LASSO, RegressionSplines, Non-parametric Regression, Trees-Based Decision, Baggin, Boosting, RandomForest, Support Vector Machines and Unsupervised Learning(if possible). Thecourse is structured into three parts: Part 1 – An Introduction to DataScience and statistical Learning, and an overview of the purpose, theorganization, main topics as well as the assessment of this module. Part 2 - Introduction to fundamentalsof Statistical Learning. 
Main contents of this part are basic and advanced conceptslike Simple Linear Regression, Multiple Linear Regression, PolynomialRegression, Logistic Regression, Linear Discriminant Analysis, QuadraticDiscriminant Analysis, K-Nearest Neighbours, Cross Validation, Bootstrap andStepwise Regression. Part 3 – Introduction to advancedfundamentals of Statistical Learning. In this part the focus lies on more sophisticatedconcepts like Ridge Regression, Lasso, Principal Component Regression, PartialLeast Squares, Regression Splines, Generalized Additive Models (GAMs), RegressionTrees, Classification Trees, Bagging, Random Forests, Boosting, Maximal MarginClassifiers, Support Vector Classifiers and Support Vector Machines. Furtherpossible topics of unsupervised learning are e.g. Principal ComponentsAnalysis, K-Means Cluster analysis and Hierarchical Cluster Analysis.Please note that the topics of the seminar projects should be on the application of statistical learning approaches to financial and economic data.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...understanding of modern Data Sciencegain fundamental knowledge of Data Science, related problems and methods to solve them.learn different advanced and modern approaches in Statistics and Econometric. understanding the relationship between Statistics, Econometrics and Data Science.understanding the roll of Econometrics in Data Science and vice versa.  learn further advanced concepts of supervised Statistical- and Machine Learning.learn further advanced concepts of unsupervised Statistical- and Machine Learning.", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...the ability to use basic and sophisticated Statistical Learning concepts.gain skills of computer intensive data analysing and for model selection. 
gain skills to collect, manage, visualize and analyse large and complex data sets.gain advanced knowledge about the programming language R.gain basic knowledge about the programming language Python", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...improve further skills of problem definition and problem solution gain ability for managing and implementation of a small empirical study projectimprove cooperative and team-work ability.improve the ability for presenting own resultsgain communication and conversation skills.", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​gain ability of self-learninggain more expertise in scientific working.obtain further training in independent studying. improve computing data analysis skills.Improve ability for writing a detailed project report.", "Inhalte (short description):", "This is an advanced seminar in data science which particularly covers the areas of modern statistical and econometric approaches as well as statistical and machine learning. Basic topics are on the application of suitable algorithms in those areas for modeling economic and financial data, especially for and forecasting economic and financial time series, based on known research results in the literature. For this purpose, new tools in modern areas of statistics and econometrics, such as local polynomial regression, P-Splines, quantile regression and functional data analysis, should be considered. Further new tools in recurrent neural networks, deep learning and reinforcement learning should be employed. Modelling and forecasting multivariate time series using proper adaptations of the above-mentioned approaches will also be studied. For high-level or research oriented seminar works more advanced topics, e.g. 
the extension of currently used methods in the literature for semiparametric modeling of long memory time series, deep learning of multivariate, functional or high-frequency financial and economic time series as well as Machine Learning algorithms for big financial and economic data can be offered.", "Lernergebnisse (learning outcomes):", "Fachkompetenz Wissen (professional expertise):", "Studierende...​", "Fachkompetenz Fertigkeit (practical professional and academic skills):", "Studierende...​the ability to use basic and sophisticated Statistical Learning concepts.gain skills of computer intensive data analysing and for model selection.gain skills to collect, manage, visualize and analyse large and complex data sets.gain advanced knowledge about the programming language R.gain basic knowledge about the programming language Python", "Personale Kompetenz / Sozial (individual competences / social skills):", "Studierende...improve further skills of problem definition and problem solutiongain ability for managing and implementation of a small empirical study projectimprove cooperative and team-work ability.improve the ability for presenting own resultsgain communication and conversation skills.", "Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously):", "Studierende...​gain ability of self-learninggain more expertise in scientific working.obtain further training in independent studying.improve computing data analysis skills.Improve ability for writing a detailed project report."""

In [9]:
# Process our input text to obtain a spaCy document for the language detection
doc = nlp_en(text.replace('",', " ").replace('"', " ").replace("   ", " ").replace("...", " "))
doc

Inhalte (short description): ​This is an advanced lecture in time series analysis developed based on basic knowledge in time series. Hence, one of our Master modules W4451 or Bachelor modules W2453, or a comparable module that you have visited at another university is a necessary requirement. The main topics of this module will be divided into two parts: Part 1: Advanced linear time series models, including the analysis of time series with seasonality and different calendar effects, multivariate time series models as well as long memory time series models; and Part 2: advanced topics of non-linear and functional time series, including long memory volatility and duration models, multivariate volatility and correlation models, volatility and correlation measures based on high-frequency financial data as well as the analysis of functional time series with short or long memory. The focuses are on the introduction of the theory and methods, practical implementation in R and their applicatio
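
The chained `.replace()` calls above only catch a fixed set of patterns (for example, exactly three spaces). As a minimal sketch using only the standard library, the same cleaning step could collapse whitespace runs of any length and also drop zero-width spaces (U+200B), which scraped web text often contains. The helper name `clean_text` is hypothetical and not part of the original notebook.

```python
import re

def clean_text(raw):
    """Hypothetical helper: normalise scraped text before language detection."""
    # Drop the list separators and quotes, as the .replace() chain does
    cleaned = raw.replace('",', " ").replace('"', " ")
    # Remove zero-width spaces left over from the web page
    cleaned = cleaned.replace("\u200b", "")
    # Replace ASCII ellipses with a space
    cleaned = cleaned.replace("...", " ")
    # Collapse any run of whitespace into a single space
    return re.sub(r"\s+", " ", cleaned).strip()

sample = '"Studierende...\u200b lernen",   "Students learn"'
print(clean_text(sample))
```

The result could then be passed to `nlp_en(...)` in place of the chained expression.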

In [10]:
# Print the language detected for the document as a whole
print(doc._.language)

{'language': 'en', 'score': 0.9999963690668028}


In [11]:
# Show the sentence-level language detection
for i, sent in enumerate(doc.sents):
    print(sent, sent._.language)

Inhalte (short description): ​This is an advanced lecture in time series analysis developed based on basic knowledge in time series. {'language': 'en', 'score': 0.9999971214382961}
Hence, one of our Master modules W4451 or Bachelor modules W2453, or a comparable module that you have visited at another university is a necessary requirement. {'language': 'en', 'score': 0.9999952621329767}
The main topics of this module will be divided into two parts: Part 1: Advanced linear time series models, including the analysis of time series with seasonality and different calendar effects, multivariate time series models as well as long memory time series models; and Part 2: advanced topics of non-linear and functional time series, including long memory volatility and duration models, multivariate volatility and correlation models, volatility and correlation measures based on high-frequency financial data as well as the analysis of functional time series with short or long memory. {'language': 'en'

Studierende erarbeiten die theoretischen {'language': 'de', 'score': 0.9999978659802343}
Inhalte durch Videos und Lehrbücher unterstützt im Selbststudium. {'language': 'de', 'score': 0.9999959175831643}
Die Anwendung der erlernten Inhalte geschieht in Präsenzveranstaltungen anhand von praxisnahen Fragestellungen und Fallstudien. {'language': 'de', 'score': 0.9999957698563378}
Zielgruppe des Moduls sind Studierende der Wirtschaftswissenschaften, die erste praktische Erfahrungen mit dem Thema Data Science machen wollen. {'language': 'de', 'score': 0.9999975135759598}
Es wird die Bereitschaft zur Einarbeitung in die Programmiersprache R erwartet. {'language': 'de', 'score': 0.9999978664578851}
The term Data Science generally describes the extraction of knowledge from large amounts of data, its goal to improve the effectiveness and efficiency of decision-making processes through the knowledge thus gained. {'language': 'en', 'score': 0.9999987839408038}
The course covers basic and advanced 

Personale Kompetenz / Sozial (individual competences / social skills): Studierende … formulate management-oriented problems in a social business context and address them in a systematic approach based on standard methods of scientific data and content analysis to derive practical implications,… develop cooperative skills in group-based task analysis, report creation, documentation, and presentation,… observe a critical outlook over shared data in social media,… underline the importance of the social media mining in each and everyday life. {'language': 'en', 'score': 0.9999970757894696}
Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): {'language': 'en', 'score': 0.7142837879341045}
Studierende … develop a critical and informed perspective on the benefits of different deep learning analytical algorithms and their characteristics,… distinguish appropriate deep learning algorithms to best address a given business problem,… interpret and ev

Der zu erstellende Projektbericht basiert auf der Struktur und den Methoden wissenschaftlicher Artikel und ermöglicht dadurch den Teilnehmern, anschließende akademische Arbeiten (z.B. Masterarbeit, eigene Publikationen) {'language': 'de', 'score': 0.9999950054617374}
auf den Ergebnissen des Moduls aufzubauen. {'language': 'de', 'score': 0.9999977673037612}
Lernergebnisse (learning outcomes): {'language': 'en', 'score': 0.571428806196788}
Fachkompetenz Wissen (professional expertise): Studierende ​… lernen neue wissenschaftliche Erkenntnisse und Artikel im Themenkomplex Social Media / Kooperative Technologien kennen… lernen Ansätze der Datensammlung im Vorfeld der Social Media Forschung… {'language': 'de', 'score': 0.9999973496448986}
kennen Verfahren zur wissenschaftlichen Datenanalyse und InterpretationStudents learn about… current scientific insights and articles in the social media and collaborative technologies field,…approaches of social media data collection and transformation as

Fachkompetenz Wissen (professional expertise): {'language': 'en', 'score': 0.85714157538512}
Studierende ​ Fachkompetenz Fertigkeit (practical professional and academic skills): {'language': 'en', 'score': 0.5714277525702355}
Studierende ​the ability to use basic and sophisticated Statistical Learning concepts.gain skills of computer intensive data analysing and for model selection.gain skills to collect, manage, visualize and analyse large and complex data sets.gain advanced knowledge about the programming language R.gain basic knowledge about the programming language Python Personale Kompetenz / Sozial (individual competences / social skills): Studierende improve further skills of problem definition and problem solutiongain ability for managing and implementation of a small empirical study projectimprove cooperative and team-work ability.improve the ability for presenting own resultsgain communication and conversation skills. {'language': 'en', 'score': 0.9999979909826306}
Personale 
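
The output above shows that short, mixed-language headings such as "Lernergebnisse (learning outcomes):" receive noticeably lower scores (around 0.57) than full sentences. One possible way to handle them, sketched here under the assumption that each detection is a dict with `language` and `score` keys as printed above, is to label low-score detections as uncertain. The helper name `confident_language` and the threshold of 0.9 are illustrative choices, not values from the original analysis.

```python
def confident_language(detection, threshold=0.9):
    """Return the language code, or 'unknown' if the score is below the threshold."""
    if detection.get("score", 0.0) >= threshold:
        return detection["language"]
    return "unknown"

# Two detections copied from the output above
detections = [
    {"language": "de", "score": 0.9999977673037612},
    {"language": "en", "score": 0.571428806196788},
]
labels = [confident_language(d) for d in detections]
print(labels)
```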

In [12]:
# This cell is only needed if you want to inspect a single sentence of the document by its index [x]. For example, uni_web[0] displays the first sentence
uni_web = list(doc.sents)
uni_web[0]

Inhalte (short description): ​This is an advanced lecture in time series analysis developed based on basic knowledge in time series.

In [13]:
# Split the input text into rough sentences. We pass "." to the .split() method because our sentences are separated by dots (note that abbreviations such as "e.g." or "z.B." are split as well)
sentences = text.split(".")

# Count the number of sentences in each language (any fragment not detected as English is counted as German)
en_count = 0
de_count = 0
for sentence in sentences:
    doc = nlp_en(sentence)
    if doc._.language["language"] == "en":
        en_count += 1
    else:
        de_count += 1
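
Splitting on "." also produces empty or whitespace-only fragments (for example from "..." or a trailing dot), and in the loop above every fragment that is not labelled `en` falls into the `else` branch. A minimal sketch of a stricter split, assuming fragments without any letters should simply be skipped, could look as follows. The helper name `split_sentences` is hypothetical.

```python
def split_sentences(raw):
    """Split on dots and drop fragments that contain no letters."""
    fragments = (part.strip() for part in raw.split("."))
    return [part for part in fragments if any(ch.isalpha() for ch in part)]

sample = "Das Modul umfasst eine Vorlesung. This is an advanced seminar...  "
print(split_sentences(sample))
```

This still cuts abbreviations apart. The sentence boundaries from `doc.sents`, as used in the earlier cell, are an alternative that avoids the manual split.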

In [14]:
# Calculate the proportion of each language
en_proportion = en_count / (en_count + de_count)
de_proportion = de_count / (en_count + de_count)

# Print the results
print("English proportion: {:.2f}%".format(en_proportion * 100))
print("German proportion: {:.2f}%".format(de_proportion * 100))

English proportion: 44.48%
German proportion: 55.52%
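
The two counters fold every non-English label into the German share. As a sketch, assuming the per-sentence labels have first been collected in a list, `collections.Counter` can report each label separately. The function name `language_proportions` and the sample labels below are made up for illustration and are not results from the dataset.

```python
from collections import Counter

def language_proportions(labels):
    """Return the share of each label as a percentage, rounded to two decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: round(100 * n / total, 2) for lang, n in counts.items()}

# Illustrative labels only
example_labels = ["en", "de", "de", "en", "UNKNOWN"]
print(language_proportions(example_labels))
```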


#### Translation from English to German

In [15]:
# Define a translation function
def translate_text(text, target_language = 'de'):
    translator = Translator(service_urls = ['translate.google.com'])
    translated_text = translator.translate(text, dest = target_language).text
    return translated_text
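
The module descriptions are long, and web translation endpoints commonly restrict the size of a single request. The following is a minimal sketch, under that assumption, of splitting a text into smaller pieces at sentence boundaries before translating each piece. The helper name `chunk_text` and the default of 4000 characters are assumptions for illustration, not documented googletrans limits.

```python
import re

def chunk_text(raw, max_chars=4000):
    """Group sentences into chunks of at most max_chars characters (assumed limit).

    A single sentence longer than max_chars is kept as its own oversized chunk.
    """
    chunks, current = [], ""
    # Split after each dot that is followed by whitespace, keeping the dot
    for sentence in re.split(r"(?<=\.)\s+", raw.strip()):
        candidate = sentence if not current else current + " " + sentence
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks

sample = "One sentence. Another sentence. A third one."
print(chunk_text(sample, max_chars=30))
```

Each chunk could then be passed to `translate_text` and the results joined with a space.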

In [16]:
# Set up the text to be translated. In this case, we copy and paste the output of the processed spaCy doc we had above
text = "Inhalte (short description): ​This is an advanced lecture in time series analysis developed based on basic knowledge in time series. Hence, one of our Master modules W4451 or Bachelor modules W2453, or a comparable module that you have visited at another university is a necessary requirement. The main topics of this module will be divided into two parts: Part 1: Advanced linear time series models, including the analysis of time series with seasonality and different calendar effects, multivariate time series models as well as long memory time series models; and Part 2: advanced topics of non-linear and functional time series, including long memory volatility and duration models, multivariate volatility and correlation models, volatility and correlation measures based on high-frequency financial data as well as the analysis of functional time series with short or long memory. The focuses are on the introduction of the theory and methods, practical implementation in R and their application in forecasting and decision making. Practical implementation in Python will be discussed as well, given that suitable Python packages are available. Application to economic and financial time series, particularly in sustainable economics and finance, will be strongly emphasised. Semiparametric extensions of corresponding approaches under non-stationary component time series models will be described as far as possible. 
Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende ​Advanced knowledge in time series and forecasting; advanced R/Python skills Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende can use advanced computational tools and sophisticated modern statistical approaches for illustrating and modeling and analysing different kinds of datagain skills to analyze big multivariate and functional data setsgain further knowledge about the programming language R and basic knowledge of Pythonimprove their computing, data illustration and data management skillsimprove their analytical and empirical study skills Personale Kompetenz / Sozial (individual competences / social skills): Studierende cooperate and work in groupsability for carrying out a practically relevant projectdeep understanding of environmental time series Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​gain more expertise and skills in scientific working and writinggain strong skills in modern data analysis and data scienceare further trained in independent and research related studying Inhalte (short description): Schätzungen zufolge sind heutzutage etwa 80% aller Daten unstrukturiert. Im Gegensatz zu strukturierten Daten, die wohlstrukturiert und inhaltlich meist numerisch sind, sind unstrukturierte Daten oft textuell und daher schwieriger zu interpretieren. Die Aufgabe, Wissen aus Textdokumenten zu extrahieren, bekannt als Textanalyse oder natürliches Sprachverständnis, ist äußerst komplex und immer noch begrenzt durch die Möglichkeiten von Computern, die Feinheiten menschlicher Sprachen zu verstehen.  In diesem Hands-on-Seminar werden die Studierenden in den aktuellen Stand des maschinellen Lernens und die Techniken der Verarbeitung natürlicher Sprache eingeführt (z.B. Textklassifikation, Themenmodellierung, künstliche neuronale Netze, Worteinbettungen). 
Durch Programmierübungen (Python) können die Studierenden nicht nur ihr theoretisches Wissen über verschiedene Algorithmen vertiefen, sondern haben auch die Möglichkeit, diese Methoden auf reale Probleme anzuwenden. It is estimated that approximately 80%of all existing data is unstructured. Unlike structured data, which is usuallywell-structured and mostly numerical, unstructured data is often textual andtherefore far more difficult to interpret. The task of extractingknowledge from text documents, known as text analysis or natural languageunderstanding, is extremely complex and still limited by the ability ofcomputers to understand the subtleties of human languages. In this hands-onseminar, students will be introduced to the current state of machine learningand natural language processing techniques (e.g. text classification, topicmodelling, artificial neural networks, word embeddings). With programmingexercises (Python), students deepen their theoretical knowledge of differentalgorithms and get the opportunity to apply these methods to real-world issues.​ Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende kennen die Herausforderungen bei der automatischen Analyse natürlich-sprachiger Textdatenkennen verschiedene Textanalyse-Techniken und können die zugrundeliegende Logik beschreibenkennen die Stärken und Schwächen spezifischer Textanalyse-Techniken​Students ​are aware of the challenges of automatically analysing natural language text dataknow different text analysis techniques and can describe the underlying logicknow the strengths and weaknesses of specific text analysis techniques Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende sammeln Textdaten aus dem Web oder unternehmensinternen Datenquellenbereinigen und transformieren Textdaten, um sie für statistische Analysen nutzbar zu machenwenden Textanalyse-Techniken auf einen vorgegebenen Datensatz an​​​Students collect text data from 
the web or company data sourcescleanse and transform text data to make it usable for statistical analysesapply text analysis techniques to a given data set​ Personale Kompetenz / Sozial (individual competences / social skills): Studierende ​lösenGeschäftsprobleme (z.B. im Marketing oder Servicemanagement) durch dieErfassung und Analyse von Textdaten (z.B. Online-Rezensionen, Social MediaBeiträge, E-Mails)    ​​Students solve business problems (e.g. in marketing or service management) by collecting and analysing text data (e.g. online reviews, social media posts, emails) Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende wählen die richtige Text-Mining-Technik für ein vorgegebenes (bestimmtes) Problem ausbewerten die Qualität der Text-Mining-Ergebnissesind sich der Grenzen der automatisierten natürlichen Sprachverarbeitung bewusst​Students select the correct text mining technique for a given (specific) problem.evaluate the quality of the text mining resultsare aware of the limitations of automated natural language processing Inhalte (short description): ​Unter dem Begriff Data Science wird im Allgemeinen die Extraktion von Wissen aus großen Datenmengen verstanden. Typischerweise ist das Ziel von Data Science, durch das gewonnene Wissen die Effektivität und Effizienz von Entscheidungsprozessen zu verbessern. In diesem Modul werden grundlegende und fortgeschrittene Konzepte und Methoden der Data Science und Ihre Anwendung in den Wirtschaftswissenschaften behandelt. Der Fokus liegt dabei auf Verfahren des überwachten und unüberwachten maschinellen Lernens (z. B. lineare und logistische Regression, Random Forest, Boosted Decision Trees, Neuronale Netze, Clustering, Dimensionality Reduction). Es handelt sich bei diesem Modul um ein Flipped Classroom Modul (https://de.wikipedia.org/wiki/Umgedrehter_Unterricht), d. h. 
Studierende erarbeiten die theoretischen Inhalte durch Videos und Lehrbücher unterstützt im Selbststudium. Die Anwendung der erlernten Inhalte geschieht in Präsenzveranstaltungen anhand von praxisnahen Fragestellungen und Fallstudien. Zielgruppe des Moduls sind Studierende der Wirtschaftswissenschaften, die erste praktische Erfahrungen mit dem Thema Data Science machen wollen. Es wird die Bereitschaft zur Einarbeitung in die Programmiersprache R erwartet.The term Data Science generally describes the extraction of knowledge from large amounts of data, its goal to improve the effectiveness and efficiency of decision-making processes through the knowledge thus gained. The course covers basic and advanced concepts and methods of Data Science and their application in an economic context with a focus on supervised and unsupervised machine learning methods (e.g. linear and logistic regression, random forest, boosted decision trees, neural networks, clustering and dimensionality reduction). The course is organized according to the flipped-class-room concept (https://en.wikipedia.org/wiki/Flipped_classroom), i.e. students work out the theoretical contents supported by videos and textbooks in self-study and practising the application in classroom sessions based practical issues and case studies. The course is aimed at students of economics who want to gain first practical experience in the field of data science. Willingness to learn the R programming language is a basic requirement.​ Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende ​kennen die Gemeinsamkeiten und Unterschiede von überwachten und unüberwachten Methoden des maschinellen Lernenskennen grundlegende lineare (insb. lineare und logistische Regression und deren Erweiterungen) und nicht-lineare Modelle (z.B. 
baumbasierte Verfahren, neuronale Netze) des maschinellen Lernens und können deren Funktionsweise erläuternkennen Methoden und Metriken zur Beurteilung der Qualität von Modellen des maschinellen Lernens​Students ​know common features and differences between supervised and unsupervised machine learning methods.know basic linear (esp. linear and logistic regression and their respective extensions) and non-linear models (e.g. tree-based methods, neural networks) of machine learning and can explain how they workknow methods and metrics for assessing the quality of machine learning models Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende ​wenden verschiedene Verfahren des maschinellen Lernens zur Erklärung und Vorhersage von wirtschaftlichen Phänomenen anevaluieren die Qualität von überwachten und unüberwachten Modellen des maschinellen LernensStudents ​apply various machine learning techniques to explain and predict economic phenomenaevaluate the quality of supervised and unsupervised machine learning models Personale Kompetenz / Sozial (individual competences / social skills): Studierende ​lösen Übungen und Fallstudien gemeinsam in PräsenzveranstaltungenStudents..​​​​solve exercise tasks and case studies together in classroom sessions Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​erarbeiten Lerninhalte selbstständig zu Hause mit Hilfe eines Lehrbuchs, begleitenden Videos und Präsentationsfolien​Students independently work out course content at home with the help of a textbook, accompanying videos and presentation slides Inhalte (short description): In the past few years, the popularity of the social media has grown remarkably, with constantly growing up amounts of users sharing all kinds of information through different platforms. More users mean more data to be mined. 
Therefore, it is vital for marketing organizations to be aware of how people express their opinions and how their feedback can affect their business. This has given rise to Social Media Analytics in order to extract business insight and value from consumer data. Social Media Analysis is a broad concept consisting of Social Network Analysis, Machine learning, Data Mining, Information Retrieval, and Natural Language Processing. Deep learning techniques as a subfield of machine learning enable machines to learn by themselves to classify and cluster the data. The fact that the data contained in social media are highly unstructured named as Big Data, makes deep learning an extremely valuable tool for companies to manipulate the data. These companies use Deep Neural Networks as the foundation stones of deep learning, to decide which concept could be interesting to which customers.To cover the concepts of deep learning in Social Media, this course starts with theoretical explanation of machine learning and data mining. In order to mine the huge number of user-contributed materials (e.g. photographs, videos, and textual context) in social media, different types of Artificial Deep Neural Networks (ADNN) will be explained. Simultaneously, the course covers basics of Python programming language and its data science libraries to equip the students with the tools which are needed to take advantage of the wealth of Big Data.Later, different deep learning algorithms such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTM), Restricted Boltzmann Machines (RBM), Autoencoders (AE) and Self-organizing Maps (SOM) will be introduced. More specifically the time series analysis, text- and sentiment analysis, image analysis and recommender systems using Deep Neural Networks will be presented. 
Finally, the topic of Social Recommender Systems (SRS) will be discussed, in which the social relations can be potentially exploited to improve the performance of online recommender systems. To finish the course fruitfully, some mini-projects regarding mining different data derived from social media platforms will be handed to the students.The notes will be declared at the end of the semester Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende … restate the recent technological evolution and related academic works in the field of social media analytics using state-of-the-art deep learning algorithms,… conduct relevant technology-driven methods to exploit insight from unstructured social media data,… integrate approaches of data-driven analysis and their benefits towards solving business problems,… inspect the inevitable importance of social networks to get a deeper insight in management problems,… realize the importance of customers’ and users’ data to create business value and insights,… conduct research in the field of social media and collaborative technologies,…develop an analytical approach to address a research or managerial problem in the social media field. 
Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende … recognize the concepts of Artificial Intelligence, Machine Learning, Data Mining, and Deep Learning in social media,… describe different paradigms of data mining such as supervised and unsupervised learning,… differentiate various types of data in the field of social media and the relative algorithms to analyze them,… explain the theory of Artificial Neural Networks and its training procedure,… differentiate different deep learning algorithms in social media data (e.g., Convolutional Neural Network for image analysis, Recurrent Neural Networks for text analysis, …)… exploit social media data and analyze them with deep learning algorithms to get insights for specific business problems,  interpret the result of data analysis,… apply different deep learning algorithms on social media data in Python environment using data mining libraries like Numpy, Pandas, Matplotlib, Tensorflow, Scikit-learn, and data visualization tools,…experiment data from most popular social media platforms, including Twitter, Facebook, Google+, StackOverflow, Blogger, YouTube and more,  analyze current research contributions in the context of deep learning in social media. Personale Kompetenz / Sozial (individual competences / social skills): Studierende … formulate management-oriented problems in a social business context and address them in a systematic approach based on standard methods of scientific data and content analysis to derive practical implications,… develop cooperative skills in group-based task analysis, report creation, documentation, and presentation,… observe a critical outlook over shared data in social media,… underline the importance of the social media mining in each and everyday life. 
Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende … develop a critical and informed perspective on the benefits of different deep learning analytical algorithms and their characteristics,… distinguish appropriate deep learning algorithms to best address a given business problem,… interpret and evaluate the quality and the implications of the research results for practitioners and academics,… conduct a systematic academic research on a predefined topic enabled with theoretical background and practical applications. Inhalte (short description): This course introduces students to basic concepts related to electronic exchanges and information related to these exchanges. Such information includes transactional data (such as orders and trades) and news. The course will also cover how to analyze the impact of news on a market with a focus on financial markets. In addition to the methodology required to analyze such data, students will also be introduced to various software tools that support such an analysis.The course is offered by our guest lecturer Prof. Fethi Rabhi (UNSW, Syndey, Australia). Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende learn terms to describe and analyze financial exchanges. Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende can analyze financial market data and news. Personale Kompetenz / Sozial (individual competences / social skills): Studierende ​work in groups and discuss implications of thelearned content. Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende can apply data analysis techniques to different markets.can assess the impact of news on market prices. Inhalte (short description): ​This module provides the students fundamentalknowledge of quantitative methods in empirical economic research atintroductory and anvanced level. 
The focus is on the theory, estimation andapplication of simple and multiple linear regression models. After a systematicintroduction to econometrics, selected special topics, such asmulticollinerity, heteroskedasticity, model selection and models with timeseries errors, will be dealt with in details. A brief introduction to theanalysis of panel data will be provided as far as possible. The course iscomputer supported and will be provided with a lot of real data examples.Numerical examples in the lectures and tutorials will be dealt with the publicpowerful programing language R. During the visit of this modul you will also beintroduced to the use of R in statistics and econometrics. Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende ​acquire systematic knowledge of the theory and application of linear regression; fundamental knowledge of special problems and methods to solve them.advanced knowledge of statistical estimation and test theory; knowledge of mathematical modelling; programing skills; teamwork ability. Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende ​well known econometric models; model selection; simulation technique in econometrics; knowledge of statistical programing. Personale Kompetenz / Sozial (individual competences / social skills): Studierende  Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​Training in modeling, presentation of ownresults, internet search, training of selflearning,cooperation and team working skills, improved computing skills, basic researchtraining. Inhalte (short description): In unserer vernetzten Welt werden in bisher ungekannter Art und Weise Daten generiert und gesammelt. Data Science (Film) bezeichnet die Extraktion von Wissen aus diesen Daten. 
Das Modul vermittelt grundlegende Konzepte und Methoden entlang des Lebenszyklus eines Data Science Projektes, von der Formulierung der Problemstellung über die Sammlung, Vorbereitung und Visualisierung der Daten bis hin zur Erkennung von Mustern und Trends in diesen mittels Verfahren des maschinellen Lernens (z. B. Regression, Klassifikation, Clustering). Das erlernte Methodenwissen wird kontinuierlich durch praxisnahe Übungen mit der Programmiersprache R angewandt und vertieft. Das Modul umfasst eine Vorlesung sowie eine Übung. Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende Die Studierenden …… kennen typische Datenqualitätsprobleme und können diese beschreiben… kennen verschiedene Diagramme zur Darstellung quantitativer Daten und können deren Vor- und Nachteile wiedergeben… kennen einfache Modelle des maschinellen Lernens und können deren Funktionsweise erläutern Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende … bereiten Rohdaten zur anschließenden Visualisierung und statistischen Analyse auf… visualisieren quantitative Daten mittels Diagrammen… wenden verschiedene Verfahren des maschinellen Lernens zur Erkennung von Mustern und Trends in quantitativen Daten an Personale Kompetenz / Sozial (individual competences / social skills): Studierende ​… lösen betriebswirtschaftlicheProblemstellungen durch die Anwendung von Data Science Methoden Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende … evaluieren die Qualität von Rohdaten… wählen die passenden Methoden zur Visualisierung und statistischen Analyse gegebener Datensätze aus… bewerten die Qualität von Modellen des maschinellen Lernens Inhalte (short description): ​Die verbreitete Nutzung digitaler sozialer Medien durch vernetzte Akteure oder Organisationen hinterlässt eine nie zuvor gekannte Menge digitaler Daten. 
Diese werden zunehmend durch Unternehmen oder Wissenschaftler genutzt, um die komplexen Abläufe im Web2.0 besser zu verstehen und gegebenenfalls besser auf Kunden einwirken zu können. Die Forschung und auch zahlreiche Unternehmen entwickelten diverse analytische Ansätze, um aus den Massenrohdaten sinnvolle und wirtschaftlich relevante Einsichten zu erzeugen. Social Media Manager beispielsweise verwenden aggregierte Darstellungen der Daten in Dashboards, um den Erfolg ihrer Arbeit (z.B. verbessernde Kundenwahrnehmung, Vertrieb) zu messen. Forscher identifizieren generelle Muster und entwickeln Metriken und Theorien.Um sich diesem datenzentrierten Arbeits- und Forschungskontext zu nähern und die Metriken und Einsichten zu erweitern, wird in diesem Modul zunächst das Konzept des Social Business als ein relevanter organisatorischer Kontext vorgestellt. Eine wichtige Rolle hierbei spielen die Managementwerkzeuge von Social Media Managern, welche die firmenrelevanten digitalen Aktivitätsdaten der Onlinenutzer in den sozialen Medien aufbereiten, verdichten und visualisieren. Parallel verwenden die Manager direkte Antworten und Reaktionen ihrer Kundengruppe als qualitative Daten für ihre Analysen. Auf Basis dieser Managementperspektive werden im Modul dann verschiedene Ansätze von Social Media Analytics besprochen und angewendet. Beispiele sind die Erstellung von Personas, Genreanalysen, Community Health Analysen, Time-Series Analysen, Event-Impact Analysen oder Netzwerkanalysen. Parallel zur Untersuchung einiger praktischer Fallbeispiele entwickeln die Teilnehmer ein eigenes Analyseprojekt auf Basis der besprochenen Methoden. Ziele sind hierbei, die komplexen digitalen Phänomene besser zu verstehen, unternehmensrelevante Einsichten für Social Media Manager zu generieren sowie eventuell darüberhinausgehend einen Forschungsbeitrag zu entwickeln (z.B. neue Metriken, Visualisierung oder Aufdeckung genereller Phänomene, Designs). 
Der zu erstellende Projektbericht basiert auf der Struktur und den Methoden wissenschaftlicher Artikel und ermöglicht dadurch den Teilnehmern, anschließende akademische Arbeiten (z.B. Masterarbeit, eigene Publikationen) auf den Ergebnissen des Moduls aufzubauen. Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende ​… lernen neue wissenschaftliche Erkenntnisse und Artikel im Themenkomplex Social Media / Kooperative Technologien kennen… lernen Ansätze der Datensammlung im Vorfeld der Social Media Forschung… kennen Verfahren zur wissenschaftlichen Datenanalyse und InterpretationStudents learn about… current scientific insights and articles in the social media and collaborative technologies field,…approaches of social media data collection and transformation as a base for social media research,…approaches of scientific data analysis and interpretation. Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende ​… analysieren aktuelle Forschungsbeiträge im Kontext von Social Media und kooperativen Technologien,… konzipieren einen eigenen Forschungszugang zum Thema,… erfassen und generieren ein Datenset als Ausgangspunkt wissenschaftlicher Analysen,… wenden Werkzeuge zur Datenanalyse und Interpretation an,… entwickeln einen systematischen Ansatz zum Aufbau und zur Strukturierung eines eigenen akademischen Forschungsprojekts (z.B. als Vorstufe zur Masterarbeit).Students……analyse current research in the field of social media and collaborative technologies,…generate their own transformed social media dataset to fit for research inquiries,…develop a systematic analytical approach to address a research or management problem (e.g., as a precursor for a master thesis). Personale Kompetenz / Sozial (individual competences / social skills): Studierende ​… können ungeklärte Fragestellungen aus Forscherperspektive zielorientiert und abstrakt formulieren und mit systematischen Standardmethoden der Daten bzw. 
Inhaltsanalyse kritisch untersuchen, sowie praktische Implikationen ableiten.Students……are enabled to formulate management-oriented problems in a social business context and address them in a systematic approach based on standard methods of scientific data and content analysis to derive practical implications. Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​… entwickeln eine kritische Perspektive auf neueste technische Entwicklungen,… wählen die passenden Methoden zur Analyse gegebener Fragestellungen aus,… bewerten die Qualität und Generalisierbarkeit der Ergebnisse und ihrer Implikationen für die Forschung und Praxis.Students…… develop a critical and informed perspective on the benefits of different software-based analytical methods and tools,… can choose in an informed manner appropriate tools and methods to best address a given business problem,… can evaluate the quality and the implications of the research results for practitioners and academics Inhalte (short description): ​This module introduces the students to Data Science and one ofthe main sub-area of Data Science, e.g. Statistical Learning, as well as theprogramming languages R and Python. Covered topics of this course are e.g. abrief introduction to Data Science, an Introduction to StatisticalLearning,  Linear Regression, Classification,Cross-Validation and Resampling Methods, Model Selection using StepwiseRegression and Regularization using Ridge Regression and LASSO, RegressionSplines, Non-parametric Regression, Trees-Based Decision, Baggin, Boosting, RandomForest, Support Vector Machines and Unsupervised Learning(if possible). Thecourse is structured into three parts: Part 1 – An Introduction to DataScience and statistical Learning, and an overview of the purpose, theorganization, main topics as well as the assessment of this module. Part 2 - Introduction to fundamentalsof Statistical Learning. 
Main contents of this part are basic and advanced conceptslike Simple Linear Regression, Multiple Linear Regression, PolynomialRegression, Logistic Regression, Linear Discriminant Analysis, QuadraticDiscriminant Analysis, K-Nearest Neighbours, Cross Validation, Bootstrap andStepwise Regression. Part 3 – Introduction to advancedfundamentals of Statistical Learning. In this part the focus lies on more sophisticatedconcepts like Ridge Regression, Lasso, Principal Component Regression, PartialLeast Squares, Regression Splines, Generalized Additive Models (GAMs), RegressionTrees, Classification Trees, Bagging, Random Forests, Boosting, Maximal MarginClassifiers, Support Vector Classifiers and Support Vector Machines. Furtherpossible topics of unsupervised learning are e.g. Principal ComponentsAnalysis, K-Means Cluster analysis and Hierarchical Cluster Analysis.Please note that the topics of the seminar projects should be on the application of statistical learning approaches to financial and economic data. Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende understanding of modern Data Sciencegain fundamental knowledge of Data Science, related problems and methods to solve them.learn different advanced and modern approaches in Statistics and Econometric. understanding the relationship between Statistics, Econometrics and Data Science.understanding the roll of Econometrics in Data Science and vice versa.  learn further advanced concepts of supervised Statistical- and Machine Learning.learn further advanced concepts of unsupervised Statistical- and Machine Learning. Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende the ability to use basic and sophisticated Statistical Learning concepts.gain skills of computer intensive data analysing and for model selection. 
gain skills to collect, manage, visualize and analyse large and complex data sets.gain advanced knowledge about the programming language R.gain basic knowledge about the programming language Python Personale Kompetenz / Sozial (individual competences / social skills): Studierende improve further skills of problem definition and problem solution gain ability for managing and implementation of a small empirical study projectimprove cooperative and team-work ability.improve the ability for presenting own resultsgain communication and conversation skills. Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​gain ability of self-learninggain more expertise in scientific working.obtain further training in independent studying. improve computing data analysis skills.Improve ability for writing a detailed project report. Inhalte (short description): This is an advanced seminar in data science which particularly covers the areas of modern statistical and econometric approaches as well as statistical and machine learning. Basic topics are on the application of suitable algorithms in those areas for modeling economic and financial data, especially for and forecasting economic and financial time series, based on known research results in the literature. For this purpose, new tools in modern areas of statistics and econometrics, such as local polynomial regression, P-Splines, quantile regression and functional data analysis, should be considered. Further new tools in recurrent neural networks, deep learning and reinforcement learning should be employed. Modelling and forecasting multivariate time series using proper adaptations of the above-mentioned approaches will also be studied. For high-level or research oriented seminar works more advanced topics, e.g. 
the extension of currently used methods in the literature for semiparametric modeling of long memory time series, deep learning of multivariate, functional or high-frequency financial and economic time series as well as Machine Learning algorithms for big financial and economic data can be offered. Lernergebnisse (learning outcomes): Fachkompetenz Wissen (professional expertise): Studierende ​ Fachkompetenz Fertigkeit (practical professional and academic skills): Studierende ​the ability to use basic and sophisticated Statistical Learning concepts.gain skills of computer intensive data analysing and for model selection.gain skills to collect, manage, visualize and analyse large and complex data sets.gain advanced knowledge about the programming language R.gain basic knowledge about the programming language Python Personale Kompetenz / Sozial (individual competences / social skills): Studierende improve further skills of problem definition and problem solutiongain ability for managing and implementation of a small empirical study projectimprove cooperative and team-work ability.improve the ability for presenting own resultsgain communication and conversation skills. Personale Kompetenz / Selbstständigkeit (individual competences / ability to perform autonomously): Studierende ​gain ability of self-learninggain more expertise in scientific working.obtain further training in independent studying.improve computing data analysis skills.Improve ability for writing a detailed project report."

In [17]:
# Use a chunk size of 4500 characters to stay safely below the 5000-character limit of the googletrans library
chunk_size = 4500

# Split the text into chunks of 4500 characters
text_chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# Initialize an empty string to store the translated text
translated_text = ""

# Translate each chunk and concatenate the translated chunks
for chunk in text_chunks:
    translated_text += translate_text(chunk)

# Print the translated text
print(translated_text)

INHALTE (Kurzbeschreibung): Dies ist eine fortschrittliche Vorlesung in der Zeitreihenanalyse, die auf Grundkenntnissen in Zeitreihen basiert.Daher ist eines unserer Master -Module W4451 oder Bachelor -Module W2453 oder ein vergleichbares Modul, das Sie an einer anderen Universität besucht haben, eine notwendige Anforderung.Die Hauptthemen dieses Moduls werden in zwei Teile unterteilt: Teil 1: Erweiterte lineare Zeitreihenmodelle, einschließlich der Analyse von Zeitreihen mit Saisonalität und unterschiedlichen Kalendereffekten, multivariaten Zeitreihenmodellen sowie Long Memory Time -Series -Modellen.und Teil 2: Fortgeschrittene Themen der nichtlinearen und funktionalen Zeitreihen, einschließlich Volatilitäts- und Dauermodelle mit langer Speicher, multivariate Volatilitäts- und Korrelationsmodelle, Volatilitäts- und Korrelationsmessungen, die auf hochfrequenten Finanzdaten basieren, sowie die Analyse der funktionalen Zeitreihen mitkurzer oder langer Speicher.Die Fokussierungen liegen a
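
The fixed-width slicing in cell 17 can cut a word or sentence in half at a chunk boundary, which may affect how the pieces are translated. Below is a minimal sketch of an alternative that prefers to break at a space. The helper name `split_into_chunks` and its `max_chars` parameter are assumptions made for illustration. They are not part of the original notebook or of googletrans.

```python
def split_into_chunks(text, max_chars=4500):
    """Split text into pieces of at most max_chars characters.

    Hypothetical helper: breaks at the last space inside each window
    when one exists, and falls back to a hard cut otherwise.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for the last space inside the window so words stay intact
            split_at = text.rfind(" ", start, end)
            if split_at > start:
                end = split_at + 1  # keep the space with the preceding chunk
        chunks.append(text[start:end])
        start = end
    return chunks


# Small demonstration on synthetic text (not the module data)
sample = "alpha beta gamma delta " * 50
demo_chunks = split_into_chunks(sample, max_chars=40)
print(len(demo_chunks), max(len(c) for c in demo_chunks))
```

Joining the chunks reproduces the input exactly, so the result could be passed to the same translation loop in place of `text_chunks`.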

In [18]:
# Write the translated text (a single string) to a JSON file named "translated_uni_web_data.json"
with open('translated_uni_web_data.json', 'w', encoding = 'utf8') as outfile:
    json.dump(translated_text, outfile, ensure_ascii = False)
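
As a quick sanity check, the written file can be read back to confirm that umlauts and other non-ASCII characters survive the round trip when `ensure_ascii = False` is combined with UTF-8 encoding. The sketch below uses a temporary directory and a short made-up string, so it does not touch the real output file. `save_and_reload` and the demo file name are hypothetical names chosen for this example.

```python
import json
import os
import tempfile


def save_and_reload(text, path):
    """Write text as JSON (UTF-8, non-ASCII kept as-is) and load it back."""
    with open(path, "w", encoding="utf8") as outfile:
        json.dump(text, outfile, ensure_ascii=False)
    with open(path, encoding="utf8") as infile:
        return json.load(infile)


# Demonstration with a short made-up string, written to a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_translation.json")
    demo_text = "Größe, Übung, Qualität"
    reloaded = save_and_reload(demo_text, demo_path)

print(reloaded == demo_text)
```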