Commit

Description typos
MarcinKosinski committed Jul 23, 2019
1 parent 9214002 commit 8d975d0
Showing 61 changed files with 64 additions and 64 deletions.
2 changes: 1 addition & 1 deletion API/automating_google_slides_creation.md
@@ -2,7 +2,7 @@

Author: Piotrek Ciurus (Azimo)

-# Descripition
+# Description

I would like to talk about automating Google Slides creation using R. First, the complete data workflow will be presented. Second, two possible approaches will be reviewed: exporting a data file with automation using Google Apps Script, and direct slide generation from an R script. Finally, I will present a practical example of a business application.

@@ -2,7 +2,7 @@

Author: Florent Bourgeois (University of Toulouse, Laboratoire de Genie Chimique)

-# Descripition
+# Description

With Excel being the computing tool most used by the engineering community, developing Excel applications that call R functions is highly desirable for engineers as it merges Excel's interactivity with a high level numerical environment. This paper was written with engineering trainers in mind. It should provide them with an applied and illustrative guide for easy development of applications that merge Excel and R using BERT as the interoperability solution. Simple examples are provided that exemplify the ease with which such applications can be created. Such applications, which are interactive by design since they use Excel as their front-end, can help engineering educators increase the attractiveness and dynamics of their engineering courses.

2 changes: 1 addition & 1 deletion API/google_page_speed_with_r.md
@@ -2,7 +2,7 @@

Author: Leszek Sieminski (Ringier Axel Springer Polska)

-# Descripition
+# Description

One of the more important and tedious tasks in digital marketing is optimizing website loading. Firstly, I will introduce the concept and show what tools are required to perform an analysis (with an example). Secondly, I will describe how to enhance the tools' capabilities by using a web API and show good practices on a real-world package. Finally, I will also briefly describe the architecture of the code and how to use the results to improve website loading time.

@@ -4,7 +4,7 @@ Author: Jaroslaw Chilimoniuk (University of Wroclaw)

Co-authors: Michal Burdukiewicz, Piotr Sobczyk, Stefan Rödiger, Malgorzata Kotulska and Pawel Mackiewicz

-# Descripition
+# Description

Amyloids are proteins associated with important clinical disorders (e.g., Alzheimer's or Creutzfeldt-Jakob's diseases). Despite their great diversity, all amyloid proteins can undergo aggregation initiated by 6- to 15-residue segments. The structure and the function of proteins are encoded in the linear sequence of amino acids. However, the aggregation propensity seems to depend not on the exact amino acid residues, but rather on their physicochemical properties. Therefore, we created a model of amyloidogenicity incorporating this knowledge.

@@ -2,7 +2,7 @@

Author: Olga Kaminska (Britenet)

-# Descripition
+# Description

State change of patients in bipolar disorder may cause irreversible changes.
The aim of the project was to predict the state change towards depression / mania in the examined patients. Each person goes through this disease differently, which is why personalization of the algorithm is so important. The study applied both supervised and unsupervised learning methods to data from real patients. The prepared solution is the first step towards creating a complete prognostic system that would make life easier for doctors and patients.
@@ -4,7 +4,7 @@ Author: Glowacka Jagoda (Transition Technologies S.A.)

Co-authors: Kamil Sijko, Konrad Wojdan

-# Descripition
+# Description

In the EPISTOP project, 101 patients with a TSC mutation causing uncontrolled growth of benign tumors were followed from birth until 2 years of age to observe the epileptogenesis process. Blood was sampled from those patients at 3 or 4 defined moments of seizure development. After that, all of the samples were sequenced. Moreover, EEG, MRI and neuropsychological studies were performed to assess the patients' condition, and clinical data were collected. Altogether, 33 TB of data were gathered and more than 60 thousand features were tested to select potential signs of epilepsy, given the patient's condition and the outcome at 2 years of age. Multiple steps were performed in order to extract patterns from the high-dimensional data. Ultimately, the goal of the analysis was to correctly identify those patients with an increased risk of developing epilepsy in the first 2 years of life.

@@ -2,7 +2,7 @@

Author: Jakub Weiner (Revolution Train)

-# Descripition
+# Description

The talk will revolve around a data-analytical approach to the phenomenon of drug exposure among Central European (CZ, GE, PL, SK) youth, as described by the data collected within the Revolution Train project and by external sources. Furthermore, a proposed always-on cloud & blockchain infrastructure will be presented with a view to building a solid research approach for innovations in primary prevention.

2 changes: 1 addition & 1 deletion Bio/r_in_ministry.md
@@ -4,7 +4,7 @@ Author: Piotr Nowosielski (Ministry of Health Republic of Poland)

Co-authors: Michal Walczynski, Mariusz Zieba, Klaudiusz Witczak, Filip Wojciechowski

-# Descripition
+# Description

The R language was introduced in the Analyses and Strategies Department of the Ministry of Health in 2015.
Since then MS Excel is no longer used as an analytic tool.
@@ -2,7 +2,7 @@

Author: Richard Louden (The Oakland Group)

-# Descripition
+# Description

Analytics without reproducibility, the ability to reproduce an output from its component parts, results in inherent risk. This is especially true in a business environment where staff can and will move to new jobs, leaving projects and work that may be vital for the business. In addition, analytics without collaboration can lead to wholly unsuitable results. This collaboration may come in the form of utilising different programming languages or additional input regarding business context. Both aspects require strong ways of working and suitable toolsets in order to be effective, which will be the main subject of this talk. Utilising previous experience of large businesses, from a management consulting and retail background, I aim to show how reproducibility can be improved with some simple methodologies, and collaboration aided with recent tools. Examples include establishing a strong base to work from by utilising projects and working with paths correctly, and improving collaboration with colleagues using tools such as reticulate and the draft redoc package.
The overall aim of this talk is to inspire those who do not currently utilise such practices to improve both their own workflows and those utilised within their company, for improved reproducibility and collaboration.
@@ -2,7 +2,7 @@

Author: Francois Jacquet (artinlean sp. zoo)

-# Descripition
+# Description

I would like to share the journey I took to build from scratch a production-grade machine learning workflow for automated stock and sales management, with emphasis on:

@@ -2,7 +2,7 @@

Author: Lidia Kolakowska (Sotrender)

-# Descripition
+# Description

Cleaning and preparing data for analysis is one of the most time-consuming stages in the analyst's work. This process is further prolonged and can be frightening when data is available as nested lists. An example is data received for analysis in JSON format, downloaded directly from an API or a non-relational database such as MongoDB.

@@ -13,4 +13,4 @@ By jumping to a higher level with time-consuming code processing, I will show ex
Tag words: nested lists, JSON data format, iterating over two lists at the same time, dealing with NULLs in nested lists, filtering list elements, parallel processing, processing in pipelines, joining data from lists without primary keys


main packages: dplyr, purrr, future, furrr, fs, jsonlite
2 changes: 1 addition & 1 deletion EDA/maste_r_of_tables.md
@@ -2,7 +2,7 @@

Author: Tomasz Żółtak (Educational Research Institute, Warsaw, Poland)

-# Descripition
+# Description

There is a widespread opinion that preparing good-looking tables in R is hard. That's not true! It is simply that some great tools for working with tables in R are not so widely known. In this talk you'll have an opportunity to learn what these tools are and how to use them. The talk will consist of 4 parts:

2 changes: 1 addition & 1 deletion EDA/r_tools_for_automated_exploratory_data_analysis.md
@@ -4,7 +4,7 @@ Author: Mateusz Staniak (Warsaw University of Technology)

Co-authors: Przemyslaw Biecek

-# Descripition
+# Description

Before a predictive model is built, the data set must be well understood. This process is usually referred to as Exploratory Data Analysis (EDA). In the era of countless easily available, but noisy and large data sets, automation of EDA is a task that could greatly speed up data analysis and aid non-experts who need to deal with data.
In this talk, I will describe many R packages for fast, automated EDA (autoEDA) with their strengths and weaknesses. The talk is based on a paper "The Landscape of R Packages for Automated Exploratory Data Analysis" which was accepted to the R Journal.
2 changes: 1 addition & 1 deletion Geo/features_of_districts_of_warsaw_visible_from_space.md
@@ -4,7 +4,7 @@ Author: Krystian Andruszek (Faculty of Economic Sciences, University of Warsaw,

Co-authors: Piotr Wójcik, Ewa Sobolewska

-# Descripition
+# Description

High-resolution daytime satellite images are commonly used to derive features of regions or smaller areas using convolutional neural networks (CNN). One can identify meaningful features like the number and density of buildings, the prevalence of shadow area as a proxy for building height, the number of cars, density and length of roads, type of farmland, roof material, etc. CNNs are a special kind of multilayer neural network applied in image recognition. CNNs identify boundaries (edges), which separate areas of different colors. Based on low-level concepts (a curve, a straight line) one can build higher-level concepts (a square, a circle, etc.) and even more abstract concepts. Training a neural network is a very demanding and time-consuming process that requires powerful computational resources. One solution is transfer learning, in which the model is not trained from scratch, but instead uses a pre-trained model that was previously trained on a large benchmark dataset to solve a similar problem.

2 changes: 1 addition & 1 deletion Geo/geospatial_data_analysis_and_visualization_in_r.md
@@ -2,7 +2,7 @@

Author: Çizmeli Servet Ahmet (PranaGEO LTD)

-# Descripition
+# Description


Analysis of geospatial data requires specialized software tools. The R ecosystem provides a rich set of powerful open source packages that make it possible to work with geospatial data. In this workshop we provide hands-on exercises with real datasets. We will learn how to import geospatial data in R, make interactive maps, convert between different formats and map projections. We will experiment on spatial queries and perform basic statistical analyses. This is a hands-on workshop so attendees are expected to bring a laptop.
@@ -2,7 +2,7 @@

Author: Maria Mikos (University of Warsaw, Department of Economic Sciences)

-# Descripition
+# Description

Weighting matrices are a crucial part of spatial econometrics. However, spatial dependency is not the only relation that can be expressed in this form. The R package spdep provides a method to build custom matrices and convert them to the listw class. This opens up the possibility of using user-built objects in modeling. By filtering not only for geographical dependence but also for heterogeneity of the sample, they can significantly reduce the bias of standard models. They can be used as an alternative to dummy variables in OLS and can replace the adjacency matrix in the Spatial Durbin Model. Two case studies were presented, using the Iris dataset and NUTS4 panel data. A categorical variable and machine learning results were used to uncover similarity in the data. OLS modeling was augmented with self-made weighting matrices and, as a result, the lowest values of Information Criteria were obtained. The author stressed that weighting matrices built on categorical data and clustering results can significantly improve econometric estimation.

2 changes: 1 addition & 1 deletion Keynotes/Jakub_Nowosad.md
@@ -6,4 +6,4 @@ Author: Jakub Nowosad (Adam Mickiewicz University, Poznan)

# Bio

Jakub is an assistant professor in the Department of Geoinformation at the Adam Mickiewicz University in Poznan, Poland. His main research is focused on developing and applying spatial methods in order to broaden our understanding of processes and patterns in the environment. He has extensive teaching experience in the fields of spatial analysis, geostatistics, statistics, and machine learning. Jakub is also an active member of the #rspatial community and a co-author of the Geocomputation with R book
2 changes: 1 addition & 1 deletion Keynotes/Marvin_Wright.md
@@ -6,4 +6,4 @@ Author: Marvin Wright (Leibniz Institute for Prevention Research and Epidemiolog

# Bio

Marvin is a Postdoc at the Leibniz Institute for Prevention Research and Epidemiology in Bremen, Germany. He is the author of several R packages, including the random forest implementation ranger. He holds a Ph.D. in Biostatistics from the University of Lübeck, supervised by Andreas Ziegler. Previously, Marvin worked at the University of Lübeck, was a visiting researcher at the University of Copenhagen and also spent some time in the automotive industry and at health insurance. His main research interests are interpretable machine learning, genetic epidemiology and survival analysis.
2 changes: 1 addition & 1 deletion Keynotes/Paula_Brito.md
@@ -6,4 +6,4 @@ Author: Paula Brito (Faculty of Economics, University of Porto)

# Bio

Paula Brito is Associate Professor at the Faculty of Economics of the University of Porto, and member of the Artificial Intelligence and Decision Support Research Group (LIAAD) of INESC TEC, Portugal. She holds a doctorate degree in Applied Mathematics from the University Paris Dauphine, and a Habilitation in Applied Mathematics from the University of Porto. Her current research focuses on the analysis of multidimensional complex data, known as symbolic data, for which she develops statistical approaches and multivariate analysis methodologies. In this context, she has been involved in two European research projects. Paula Brito was president of the International Association for Statistical Computing (IASC) in 2013-2015. She has been an invited speaker at several international conferences, is regularly a member of international program committees, and was chair of COMPSTAT 2008. Web-page: www.fep.up.pt/docentes/mpbrito
2 changes: 1 addition & 1 deletion Keynotes/Sigrid_Keydana.md
@@ -6,4 +6,4 @@ Author: Sigrid Keydana (RStudio)

# Bio

Sigrid is an Applied Researcher at RStudio. She has experience as a psychologist, software developer and data scientist. She is passionate about exploring the frontiers of deep learning and especially helping users employ the power of deep learning from R.
2 changes: 1 addition & 1 deletion Keynotes/Wit_Jakuczun.md
@@ -6,4 +6,4 @@ Author: Wit Jakuczun (WLOG Solutions)

For many years software engineers have put enormous effort into developing best practices to deliver stable and maintainable software. How can R users benefit from this experience? I will try to answer this question by going through several concepts and tools that are natural for software engineers but are often undervalued by R users.

I will start with a description of the deployment process because this is the ultimate step that exposes all weaknesses. You will learn about structuring an R project, using abstractions to manage a model’s features, automating the model-building process, optimizing the performance of the solution and the challenges of the deployment process itself.
@@ -4,7 +4,7 @@ Author: Dominik Rafacz (Warsaw University of Technology)

Co-authors: Katarzyna Sidorczuk, Stefan Rodiger, Przemyslaw Gagat, Michal Burdukiewicz

-# Descripition
+# Description

Background: The advancements in various 'omics' fields have resulted in the discovery of many new protein sequences. Their functional annotations, however, come in at a much slower pace because they require laborious and often expensive experimental procedures. The machine learning models fill in this gap by providing estimates of protein functions. Although they do not replace the experiments, the in silico methods undoubtedly help scientists to understand the ever-growing protein datasets. The challenges of developing appropriate models for protein data exclude from the field scientists with limited machine learning expertise and resources. Therefore, we propose autobiograML, an R package designed to automatically apply our framework for protein function prediction [1, 2].
Methods: autobiograML models the relationships between provided protein sequences (encoded as amino acid motifs) and annotations. The Bayesian framework optimizes the hyperparameters of the model in nested cross-validation. The outer layer of the cross-validation is later used to select the optimal machine learning algorithm. Our software produces not only a model but also a list of important motifs for further studies. Moreover, autobiograML generates Shiny web servers that might be later distributed between less R-savvy users.
@@ -4,7 +4,7 @@ Author: Anne Bras (Erasmus University, the Netherlands)

Co-authors: Vincent van der Velden

-# Descripition
+# Description

Inder Taneja (an Indian mathematician) attempted to write the integers from 1 up to 11111 in terms of 1 to 9 (in increasing and decreasing order) by using addition, subtraction, multiplication, division, exponentiation, parentheses and/or digit concatenation. For example:

@@ -2,7 +2,7 @@

Author: Hubert Baniecki (MI2 MiNI PW)

-# Descripition
+# Description

Hubert Baniecki

