Presentations

2024

How to automate dependency updates with the Renovate bot (DevOps Pro Europe 2024)

Small Language Models - running a ChatGPT equivalent on your own laptop (Lubelskie Dni Informatyki 2024)

2023

Building a Q&A engine with LangChain and open-source LLMs (DevAI by Data Science Summit 2023)

Building a Q&A engine with LangChain and open-source LLMs (ZS BIT 1001)

2019

Do androids read about electric sheep? Machine reading comprehension algorithms - revised and extended version, presented at Devoxx UA 2019

Do androids read about electric sheep? Machine reading comprehension algorithms (materials; presented at the Code4Life Conference and Code4Life Meetup in Poznan)

It has been almost 10 years since Watson beat human contestants at Jeopardy. Computers are better than our chess or Go grandmasters. They fly rockets in space and drive cars. But they still cannot read and understand books. Why is it so hard and how far are computer algorithms from human-level natural language understanding?

2017

Natural language processing: IT meets grammar (Code4Life Conference, Data Science Summit, Code4Life Meetup)

Business Intelligence is not only about numbers and figures. Unstructured text data, like patient records or product reviews, contain valuable insights, but extracting them is not trivial. By presenting two use cases from Roche Global IT Solutions, we will show you the basic NLP methods and prove that it may be time to brush up on your grammar knowledge.

Natural language processing at Roche (Code4Life Scientific)

Sometimes documents contain information valuable for different teams, but at the same time they may contain sensitive data and - as a precaution - only a small group of people can access them. Manual review and redaction is usually not feasible due to the number of documents. Named entity recognition (NER) makes it possible to locate different types of entities and selectively mask them. The upcoming GDPR regulation will only increase the demand for efficient text data de-identification.

Full-text search engines can provide more relevant results if the indexing process understands the semantic relations between terms in a document. Patient medical records very often contain both symptoms and negative findings. Negation detection makes it possible to distinguish the relevant documents from the ones that merely contain some term.

Conversational user interfaces (CUI) like chatbots can provide an additional medium to communicate with the customers. They promise an automated communication solution that is familiar to the users, personalized and context-sensitive. At the same time, the process of training and tuning the chatbot's model is challenging, because natural language understanding (NLU) engines offered by the vendors are black boxes.

This presentation gives an overview of different solutions developed at Roche Global IT Solutions that employ NLP techniques, with a special focus on encountered challenges and unsolved problems.

Data warehousing on Hadoop - DOs and DON'Ts (Code4Life Meetup)

The shortest intro to Big Data (IT for SHE - Tech Camp for students)

What does a developer do in a healthcare company? (IT for SHE - presentation for children)

2016

Data warehousing on Hadoop - DOs and DON'Ts (slide deck + recording; presented at the Code4Life Conference and PAZUR meetup)

  • Extended version of the Devoxx presentation

Data warehousing on Hadoop - One important DON'T and a few DOs (slide deck; presented at Devoxx PL 2016)

At the same time, you can have your lunch and learn why you should attend my evening BOF session. Does it sound like a good deal?

The brave new world of Big Data has been around for a while and its tools have been successfully applied to solve different problems. But is it a silver bullet? Is it really completely new? Can you forget the old truths of design, architecture and project management?

The goal of the StraDa project is to gather the log files from hundreds of Roche diagnostic instruments spread around the world and transform these TBs of data into a data warehouse.

We built it, but it wasn't a straightforward task.

Please come to learn what our biggest mistake was and then come back at 19:30 for a full-blown session.

Data warehousing on Hadoop - After a few months in production (slide deck; presented at Devoxx PL 2016)

Medical laboratory instruments produce immense volumes of log files. Until recently, they were used only for troubleshooting and maintenance purposes. However, hidden inside are insights that could allow the laboratory managers to streamline and optimize the diagnostic process.

The goal of the StraDa project is to gather the log files from hundreds of Roche diagnostic instruments spread around the world, transform these TBs of data into actionable information and make it available for the business users.

Yet another data warehouse? Sounds reasonably easy? Well, so we thought. And we were wrong.

Please come to learn about some of the mistakes we made and problems we encountered, so you can avoid them.

I will have slides, but I don’t need to follow them. Ask questions! Challenge me! Share your own experience!

Data warehousing on Hadoop - 7 months later (slide deck; presented at the PAS seminar)

2015

Data warehousing on Hadoop (slide deck; presented at the DAC4B conference)