kylejsarte/TelechatSharp
TelechatSharp


TelechatSharp handles the deserialization of JSON exported from Telegram Desktop, making it easy to work with your chat data in .NET applications. Useful for text extraction and data analysis.

TelechatSharp is available on NuGet.

dotnet add package TelechatSharp

Accessing Your Data

Construct a new Chat object using the file path to your JSON:

Chat chat = new Chat("telegram.json");

Easily extract meaningful data from your chat history:

IEnumerable<Message> messages = chat.Messages;

IEnumerable<Message> messagesFromKyle = messages.FromMember("Kyle Sarte");

IEnumerable<Message> messagesContainingLol = messages.ContainingText("lol");

IEnumerable<string> allEmailsSentInChat = messages.GetAllTextsOfTextEntityType("email");

IEnumerable<string> allPhoneNumbersSentInChat = messages.GetAllTextsOfTextEntityType("phone");

Advanced Use Cases

For advanced data analysis use cases, combine TelechatSharp with libraries such as Microsoft.Data.Analysis:

using TelechatSharp.Core;
using Microsoft.Data.Analysis;

Chat chat = new Chat("telegram.json");

// From TelechatSharp.Core, get chat member names and their messages.
IEnumerable<string> from = chat.Messages.Select(m => m.From);
IEnumerable<string> text = chat.Messages.Select(m => m.Text);

// From Microsoft.Data.Analysis, create a new DataFrame using data from TelechatSharp.
DataFrameColumn[] columns = {
   new StringDataFrameColumn("From", from),
   new StringDataFrameColumn("Text", text)
};

DataFrame dataFrame = new(columns);

This code produces a DataFrame similar to the following:

From     Text
Céline   You are gonna miss that plane.
Jesse    I know.

Alternative Construction

A Chat object can also be instantiated using a StreamReader:

public Chat(StreamReader streamReader)
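
Because a StreamReader can wrap any stream, this constructor is presumably useful when the export does not live in a file on disk (for example, an uploaded file held in memory). The following is a minimal sketch under that assumption. The sample JSON is hypothetical, and the Chat call is left as a comment so the snippet runs without the package installed:

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical in-memory export content; a real Telegram export is much larger.
string json = "{ \"messages\": [] }";

// A StreamReader can wrap any Stream, not only a file on disk.
using MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(json));
using StreamReader streamReader = new StreamReader(stream, Encoding.UTF8);

// With TelechatSharp referenced, the reader would be passed to the constructor above:
// Chat chat = new Chat(streamReader);

// Stand-in so this sketch runs on its own: read the content back from the reader.
string roundTripped = streamReader.ReadToEnd();
Console.WriteLine(roundTripped);
```

For a file on disk, new StreamReader("telegram.json") could be used in place of the MemoryStream-backed reader.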

Further Reading

For readability, JSON samples in this README will omit most properties and only include those relevant to the topic being discussed.

Chat.cs

While the library is most useful through a Chat object's Messages property, some derived data about a chat can be accessed via extension methods:

ChatExtensions.cs

var dateCreated = chat.GetDateCreated();

var members = chat.GetMembers();

var originalMembers = chat.GetOriginalMembers();

Message.cs

Data about individual messages can be accessed through properties of the custom Message object:

{
   "messages": [
      {
         "type": "message",
         "date": "2024-01-27T19:40:00",
         "from": "Kyle Sarte",
         "text": "TelechatSharp is public!"
      }
   ]
}

foreach (Message message in chat.Messages)
{
  Console.WriteLine($"On {message.Date}, {message.From} said '{message.Text}'");
}

Output (the exact date format depends on the current culture):

On 1/27/2024 7:40:00 PM, Kyle Sarte said 'TelechatSharp is public!'

Message Types

In the schema, Telegram messages are either Message types or Service types. A Message is any basic text message. A Service is any action performed on the chat such as the pinning of a message or the invitation of a new member:

{
   "messages": [
      {
         "id": 1,
         "type": "message",
         "text": "Someone please change the group photo."
      },
      {
         "id": 2,
         "type": "service",
         "action": "edit_group_photo"
      }
   ]
}

Messages can be filtered by type through extension methods:

var messageTypeMessages = chat.Messages.GetMessageTypeMessages();
var serviceTypeMessages = chat.Messages.GetServiceTypeMessages();
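
To illustrate what the type discriminator looks like in the raw export, here is a standalone sketch that separates the two kinds of entries using System.Text.Json and LINQ. It does not use TelechatSharp and is not how the library's extension methods are implemented. The variable names are illustrative only:

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Same sample as above, inlined so the sketch is self-contained.
string json = @"{
  ""messages"": [
    { ""id"": 1, ""type"": ""message"", ""text"": ""Someone please change the group photo."" },
    { ""id"": 2, ""type"": ""service"", ""action"": ""edit_group_photo"" }
  ]
}";

using JsonDocument document = JsonDocument.Parse(json);
JsonElement[] all = document.RootElement.GetProperty("messages").EnumerateArray().ToArray();

// Split the entries by the value of their "type" property and keep the IDs.
int[] messageIds = all
    .Where(m => m.GetProperty("type").GetString() == "message")
    .Select(m => m.GetProperty("id").GetInt32())
    .ToArray();

int[] serviceIds = all
    .Where(m => m.GetProperty("type").GetString() == "service")
    .Select(m => m.GetProperty("id").GetInt32())
    .ToArray();

Console.WriteLine($"message IDs: {string.Join(", ", messageIds)}");
Console.WriteLine($"service IDs: {string.Join(", ", serviceIds)}");
```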

MessagesExtensions.cs

For a full list of available extension methods on Message collections, refer to MessagesExtensions.cs.

Full API documentation coming soon.

Text.cs & TextEntity.cs

Message.Text and Message.TextEntities can be used to work with text data.

Text.cs

Use Message.Text to get a plain text string of a message's text content.

The text property of a message in the JSON is polymorphic: it can be either a plain text string or an array of plain text strings and text entity objects. Message.Text is backed by a private field of the custom type Text, which handles any necessary string building so that these text entities are abstracted away and text retrieval stays simple.

{
   "messages": [
      {
         "type": "message",
         "text": "This is a message with only plain text."
      },
      {
         "type": "message",
         "text": [
            "This is a message with plain text and ",
            {
               "type": "bold",
               "text": "bold text."
            },
            " The bold text appears in the JSON as a text entity. Links, such as ",
            {
               "type": "link",
               "text": "https://www.kylejsarte.com"
            },
            " will also appear as text entities."
         ]
      }
   ]
}

A call to Message.Text for the second message would build and return the following plain text string:

This is a message with plain text and bold text. The bold text appears in the JSON as a text entity. Links, such as https://www.kylejsarte.com will also appear as text entities.
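
For readers curious about what this kind of string building involves, the following is a rough standalone sketch using System.Text.Json. It is not the library's implementation. FlattenText is a hypothetical helper, and the shortened sample JSON is made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not part of TelechatSharp: flattens a raw "text" value
// that is either a JSON string or an array of strings and text entity objects.
static string FlattenText(JsonElement text)
{
    if (text.ValueKind == JsonValueKind.String)
    {
        return text.GetString() ?? string.Empty;
    }

    if (text.ValueKind != JsonValueKind.Array)
    {
        return string.Empty;
    }

    var builder = new StringBuilder();
    foreach (JsonElement part in text.EnumerateArray())
    {
        if (part.ValueKind == JsonValueKind.String)
        {
            // Plain text segment.
            builder.Append(part.GetString());
        }
        else if (part.ValueKind == JsonValueKind.Object
                 && part.TryGetProperty("text", out JsonElement entityText)
                 && entityText.ValueKind == JsonValueKind.String)
        {
            // Text entity: keep only its text and drop the formatting type.
            builder.Append(entityText.GetString());
        }
    }

    return builder.ToString();
}

string json = @"{
  ""messages"": [
    { ""type"": ""message"", ""text"": ""Plain text only."" },
    { ""type"": ""message"", ""text"": [ ""Plain and "", { ""type"": ""bold"", ""text"": ""bold"" }, "" text."" ] }
  ]
}";

using JsonDocument document = JsonDocument.Parse(json);
var flattened = new List<string>();
foreach (JsonElement message in document.RootElement.GetProperty("messages").EnumerateArray())
{
    flattened.Add(FlattenText(message.GetProperty("text")));
}

foreach (string line in flattened)
{
    Console.WriteLine(line);
}
// Prints:
// Plain text only.
// Plain and bold text.
```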

TextEntity.cs

Use Message.TextEntities for finer-grained access to text data.

Unlike Message.Text, Message.TextEntities preserves the structure of the objects in the text_entities property of each message in the JSON. It returns a collection of TextEntity objects with Type and Text properties:

{
   "messages": [
      {
         "id": 1,
         "type": "message",
         "text_entities": [
            {
               "type": "link",
               "text": "https://www.kylejsarte.com"
            },
            {
               "type": "mention",
               "text": "Kyle Sarte"
            }
         ]
      }
   ]
}

// Get the message with ID 1.
var message = chat.Messages.First(m => m.Id == 1);

foreach (TextEntity textEntity in message.TextEntities)
{
    Console.WriteLine($"Type: {textEntity.Type}, Text: {textEntity.Text}");
}

Output is:

Type: link, Text: https://www.kylejsarte.com

Type: mention, Text: Kyle Sarte
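
The same type and text pairs are presumably what methods such as GetAllTextsOfTextEntityType draw on. As a conceptual illustration only, the sketch below collects entity texts of one type across all messages directly from raw JSON with System.Text.Json. EntityTextsOfType is a hypothetical helper, the sample data (including the email address) is made up, and nothing here reflects the library's actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: gathers the text of every entity of the given type,
// skipping messages that have no "text_entities" property.
static List<string> EntityTextsOfType(JsonElement messages, string entityType) =>
    messages.EnumerateArray()
        .SelectMany(m => m.TryGetProperty("text_entities", out JsonElement entities)
            ? entities.EnumerateArray().ToArray()
            : Array.Empty<JsonElement>())
        .Where(e => e.GetProperty("type").GetString() == entityType)
        .Select(e => e.GetProperty("text").GetString() ?? string.Empty)
        .ToList();

string json = @"{
  ""messages"": [
    { ""id"": 1, ""type"": ""message"", ""text_entities"": [
        { ""type"": ""link"", ""text"": ""https://www.kylejsarte.com"" },
        { ""type"": ""email"", ""text"": ""kyle@example.com"" } ] },
    { ""id"": 2, ""type"": ""service"", ""action"": ""edit_group_photo"" }
  ]
}";

using JsonDocument document = JsonDocument.Parse(json);
JsonElement messages = document.RootElement.GetProperty("messages");

List<string> emails = EntityTextsOfType(messages, "email");
List<string> links = EntityTextsOfType(messages, "link");

Console.WriteLine(string.Join(", ", emails));
Console.WriteLine(string.Join(", ", links));
```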