# Getting Unalive Data
This notebook collects my second round of data, reusing the code from the [getting_bsky_data](https://github.com/Data-Science-for-Linguists-2025/Algospeak-on-Bluesky/blob/main/getting_bsky_data.ipynb) notebook.

## Logging into the API

In [2]:
from atproto import Client, client_utils
import pandas as pd

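fname = r'private\pw.txt'
# Read the password; the context manager closes the file even if reading fails
with open(fname, 'r') as file:
    pw = file.read().strip()  # drop any trailing newline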

client = Client()
client.login('sararosenauling.bsky.social', pw)

ProfileViewDetailed(did='did:plc:66lizgbz577cdzpwnfyox3g6', handle='sararosenauling.bsky.social', associated=ProfileAssociated(chat=None, feedgens=0, labeler=False, lists=0, starter_packs=0, py_type='app.bsky.actor.defs#profileAssociated'), avatar='https://cdn.bsky.app/img/avatar/plain/did:plc:66lizgbz577cdzpwnfyox3g6/bafkreicddmiczeju5vfi5yxgp2oosshkakweumupixgkofwwkq6jp3nqiq@jpeg', banner='https://cdn.bsky.app/img/banner/plain/did:plc:66lizgbz577cdzpwnfyox3g6/bafkreih3qtqbp27r3sq6jskcn4wmb7oxmbbbt6p3lcmhd5zbqi6ww5k5pm@jpeg', created_at='2023-06-25T12:38:14.621Z', description='she/her - Sociolinguist studying the internet at Pitt! Also into kpop/kdrama, musicals, the Phillies, cross stiching and crocheting, and singing in general\n', display_name='Sara Rosenau', followers_count=939, follows_count=1067, indexed_at='2024-11-11T20:50:23.741Z', joined_via_starter_pack=None, labels=[], pinned_post=None, posts_count=321, viewer=ViewerState(blocked_by=False, blocking=None, blocking_by_list=N
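
The password lookup above can also be wrapped in a small helper. This is only a sketch: the function name `load_password` and the environment variable `BSKY_APP_PASSWORD` are made up for illustration and are not part of the `atproto` library.

```python
import os
from pathlib import Path

def load_password(path, env_var="BSKY_APP_PASSWORD"):
    """Return the password from env_var if it is set, otherwise from the file at path."""
    # env_var is a hypothetical name chosen for this sketch
    value = os.environ.get(env_var)
    if value:
        return value.strip()
    # strip() removes the trailing newline that most editors add to the file
    return Path(path).read_text(encoding="utf-8").strip()
```

With a helper like this, the login cell would only need `pw = load_password(fname)`.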

## Figuring out how to get connected posts and other metadata

In [3]:
results = client.app.bsky.feed.search_posts({'q': 'unalive', 'limit': 30, 'sort': 'top'})
results_dict = results.model_dump()

In [6]:
results_dict['posts'][2]

{'author': {'did': 'did:plc:6bzr3x2js7436xj46zfi5xxo',
  'handle': 'shinyquagsire.bowser.gay',
  'associated': {'chat': {'allow_incoming': 'all',
    'py_type': 'app.bsky.actor.defs#profileAssociatedChat'},
   'feedgens': None,
   'labeler': None,
   'lists': None,
   'starter_packs': None,
   'py_type': 'app.bsky.actor.defs#profileAssociated'},
  'avatar': 'https://cdn.bsky.app/img/avatar/plain/did:plc:6bzr3x2js7436xj46zfi5xxo/bafkreib2s2qnzbrvugmrb65lzgcgpz4a76f5r45gpqw35vuxkcy4xy4h24@jpeg',
  'created_at': '2023-06-27T17:54:16.574Z',
  'display_name': 'Shiny Quagsire',
  'labels': [],
  'viewer': {'blocked_by': False,
   'blocking': None,
   'blocking_by_list': None,
   'followed_by': None,
   'following': None,
   'known_followers': None,
   'muted': False,
   'muted_by_list': None,
   'py_type': 'app.bsky.actor.defs#viewerState'},
  'py_type': 'app.bsky.actor.defs#profileViewBasic'},
 'cid': 'bafyreidcnzoubgojjvsu3wzoa5ebjwjhswzzubdyyylw4yr5aackieytmm',
 'indexed_at': '2025-04-17T

If a post is a reply, the `'reply'` field under `'record'` is not `None`; for a top-level post it is `None`.
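
A minimal sketch of that check, using two hand-written dictionaries as stand-ins for entries of `results_dict['posts']` (they are not real API output, and `reply_parent_uri` is a name chosen for this example):

```python
def reply_parent_uri(post):
    """Return the parent post's URI if `post` is a reply, otherwise None."""
    reply = post["record"].get("reply")
    if reply is None:
        return None
    return reply["parent"]["uri"]

# Hypothetical stand-ins that mimic only the fields used here
top_level = {"record": {"text": "hello", "reply": None}}
a_reply = {
    "record": {
        "text": "hi back",
        "reply": {"parent": {"uri": "at://did:plc:example/app.bsky.feed.post/abc"}},
    }
}

print(reply_parent_uri(top_level))
print(reply_parent_uri(a_reply))
```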

### Converting URIs to URLs

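As explained in [this post](https://github.com/bluesky-social/atproto/discussions/2523), a post's `at://` URI can be converted to a `bsky.app` URL by reusing its DID and its record key (the last segment of the URI).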

In [4]:
import re
u = re.split('/', 'at://did:plc:awzzrtrcrvpnxi3ph2sbhxwv/app.bsky.feed.post/3lmt2lhdlwk2l')
u

['at:',
 '',
 'did:plc:awzzrtrcrvpnxi3ph2sbhxwv',
 'app.bsky.feed.post',
 '3lmt2lhdlwk2l']

In [5]:
'https://bsky.app/profile/'+u[2]+'/post/'+u[4]

'https://bsky.app/profile/did:plc:awzzrtrcrvpnxi3ph2sbhxwv/post/3lmt2lhdlwk2l'

In [6]:
def uri_to_url(uri):
    # 'at://<did>/app.bsky.feed.post/<rkey>' splits into ['at:', '', did, collection, rkey]
    u = re.split('/', uri)
    url = 'https://bsky.app/profile/'+u[2]+'/post/'+u[4]
    return url

In [7]:
uri_to_url('at://did:plc:g7cu7736qmemcopvjip74g3b/app.bsky.feed.post/3lmt2wyd5q22a')

'https://bsky.app/profile/did:plc:g7cu7736qmemcopvjip74g3b/post/3lmt2wyd5q22a'
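
`uri_to_url` assumes every input is a well-formed post URI. The sketch below is a stricter variant that raises an error on anything else. The name `post_uri_to_url` and the exact pattern are my own choices for illustration, not something taken from the linked discussion.

```python
import re

# Matches 'at://<authority>/app.bsky.feed.post/<record key>' and nothing else
POST_URI = re.compile(r"^at://([^/]+)/app\.bsky\.feed\.post/([^/]+)$")

def post_uri_to_url(uri):
    """Convert a post's at:// URI to a bsky.app URL, or raise ValueError."""
    match = POST_URI.match(uri)
    if match is None:
        raise ValueError(f"not a post URI: {uri!r}")
    authority, rkey = match.groups()
    return f"https://bsky.app/profile/{authority}/post/{rkey}"

print(post_uri_to_url("at://did:plc:g7cu7736qmemcopvjip74g3b/app.bsky.feed.post/3lmt2wyd5q22a"))
```

Failing loudly here means a URI from another collection (a like or a repost, say) cannot quietly produce a broken link.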

## Getting Posts
Unfortunately, I can only get 100 posts per request, so I'll run the search again over a different time frame to collect more.
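
If more than a couple of time frames are needed, the bounds can be generated instead of typed by hand. A minimal sketch, assuming fixed-length windows and the same timestamp format used later in this notebook; `time_windows` is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone

def time_windows(start, end, days=14):
    """Split the span from start to end into consecutive (since, until) string pairs."""
    fmt = "%Y-%m-%dT%H:%M:%S.000Z"
    step = timedelta(days=days)
    windows = []
    lower = start
    while lower < end:
        upper = min(lower + step, end)  # the last window may be shorter
        windows.append((lower.strftime(fmt), upper.strftime(fmt)))
        lower = upper
    return windows

windows = time_windows(
    datetime(2025, 3, 3, tzinfo=timezone.utc),
    datetime(2025, 3, 31, tzinfo=timezone.utc),
)
for since, until in windows:
    print(since, until)
```

Each pair could then be passed as the `since` and `until` arguments of a search call.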

In [33]:
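def search2df_top(query, since, until):
    # Fetch up to 100 top-sorted posts matching `query`; `since`/`until` are
    # ISO 8601 timestamp strings (or None for no bound). Returns one row per post.
    results = client.app.bsky.feed.search_posts({'q': query, 'limit': 100, 'sort': 'top', 'since': since, 'until': until})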
    results_dict = results.model_dump()
    query_data = []
    for post in results_dict['posts']:
        metadata = {}
        metadata['text'] = post['record']['text']
        metadata['author'] = post['author']['handle']
        metadata['display_name'] = post['author']['display_name']
        metadata['date'] = post['record']['created_at']
        metadata['likes'] = post['like_count']
        metadata['quotes'] = post['quote_count']
        metadata['replies'] = post['reply_count']
        metadata['reposts'] = post['repost_count']
        metadata['uri'] = post['uri']
        metadata['url'] = uri_to_url(post['uri'])

        if post['record']['reply'] is not None:
            metadata['reply_to'] = 'Yes'
            metadata['reply_to_uri'] = post['record']['reply']['parent']['uri']
            metadata['reply_to_url'] = uri_to_url(post['record']['reply']['parent']['uri'])
        else:
            metadata['reply_to'] = 'No'
            metadata['reply_to_uri'] = None
            metadata['reply_to_url'] = None
            
        metadata['query'] = query
        query_data.append(metadata)
    query_df = pd.DataFrame(query_data)
    return query_df

In [34]:
df = search2df_top('unalive', None, None)

In [35]:
pd.set_option('display.max_columns', 100) # all columns show, but long cell values are still cut off (that is controlled by display.max_colwidth)
df.head()

Unnamed: 0,text,author,display_name,date,likes,quotes,replies,reposts,uri,url,reply_to,reply_to_uri,reply_to_url,query
0,unalive,ninochuu.zip,nino,2025-04-12T20:19:42.329Z,154,0,0,29,at://did:plc:75rnvxji6taq2xyxq63ejhmm/app.bsky...,https://bsky.app/profile/did:plc:75rnvxji6taq2...,No,,,unalive
1,We should unalive Caesar?,doktorslek.bsky.social,Doktor Slek 🦘☭,2025-04-15T02:50:06.393Z,10,0,0,0,at://did:plc:g7cu7736qmemcopvjip74g3b/app.bsky...,https://bsky.app/profile/did:plc:g7cu7736qmemc...,Yes,at://did:plc:awzzrtrcrvpnxi3ph2sbhxwv/app.bsky...,https://bsky.app/profile/did:plc:awzzrtrcrvpnx...,unalive
2,You ever chill so hard that you appear unalive??,laminatedliss.bsky.social,Churro’s Mom,2025-04-11T15:57:19.608Z,40,1,7,1,at://did:plc:kvyauj6dn35wre6meqyg6ync/app.bsky...,https://bsky.app/profile/did:plc:kvyauj6dn35wr...,No,,,unalive
3,Musk can unalive any of us and cut off all mea...,iputadollarin.bsky.social,Iputadollariniwinacar,2025-04-12T23:12:34.756Z,7,0,2,0,at://did:plc:icoabg5urmotmxr3cko3qfsi/app.bsky...,https://bsky.app/profile/did:plc:icoabg5urmotm...,Yes,at://did:plc:q3bbdtxch45wvfxrxpblphxn/app.bsky...,https://bsky.app/profile/did:plc:q3bbdtxch45wv...,unalive
4,That happened the moment we let these companie...,jayej330.bsky.social,"Jayleigh ""Jaye"" Jimenez",2025-04-14T08:57:07.765Z,29,0,1,0,at://did:plc:r6coggh5qqmqrqvrpitb7khu/app.bsky...,https://bsky.app/profile/did:plc:r6coggh5qqmqr...,Yes,at://did:plc:uqppyrcon566pkrszusjonav/app.bsky...,https://bsky.app/profile/did:plc:uqppyrcon566p...,unalive


In [36]:
df[['reply_to', 'reply_to_url']][:10] # seems to work!

Unnamed: 0,reply_to,reply_to_url
0,No,
1,Yes,https://bsky.app/profile/did:plc:awzzrtrcrvpnx...
2,No,
3,Yes,https://bsky.app/profile/did:plc:q3bbdtxch45wv...
4,Yes,https://bsky.app/profile/did:plc:uqppyrcon566p...
5,Yes,https://bsky.app/profile/did:plc:5foqpamtsdc6f...
6,Yes,https://bsky.app/profile/did:plc:ca5og5dzdlmoe...
7,Yes,https://bsky.app/profile/did:plc:vrps533rrs2nb...
8,No,
9,Yes,https://bsky.app/profile/did:plc:zrq6g77ftuyle...


In [46]:
# finding the timeframe of this set
date = df[['date']]
date

Unnamed: 0,date
0,2025-04-12T20:19:42.329Z
1,2025-04-15T02:50:06.393Z
2,2025-04-11T15:57:19.608Z
3,2025-04-12T23:12:34.756Z
4,2025-04-14T08:57:07.765Z
...,...
94,2025-04-04T02:58:38.927Z
95,2025-04-03T03:52:44.059Z
96,2025-04-04T15:56:15.260Z
97,2025-04-03T22:04:30.308Z


In [45]:
date.sort_values(by=['date'])

Unnamed: 0,date
98,2025-04-02T12:08:12.498Z
95,2025-04-03T03:52:44.059Z
97,2025-04-03T22:04:30.308Z
91,2025-04-04T00:23:33Z
94,2025-04-04T02:58:38.927Z
...,...
41,2025-04-15T04:03:45.656Z
9,2025-04-15T04:59:30.134Z
35,2025-04-15T06:27:04.170Z
29,2025-04-15T14:30:39.330Z
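
Sorting the raw strings works here, but the timestamps do not all have the same precision (row 91 has no milliseconds), and plain string comparison can misorder values like that within the same second. A sketch of parsing them first, using only the standard library and a few dates copied from the output above; `parse_ts` is a name chosen for this example:

```python
from datetime import datetime

def parse_ts(ts):
    """Parse a timestamp such as '2025-04-04T00:23:33Z' or '2025-04-02T12:08:12.498Z'."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat,
    # so it is swapped for an explicit UTC offset first
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

dates = [
    "2025-04-12T20:19:42.329Z",
    "2025-04-04T00:23:33Z",
    "2025-04-02T12:08:12.498Z",
    "2025-04-15T14:30:39.330Z",
]
parsed = sorted(parse_ts(d) for d in dates)
print("earliest:", parsed[0].isoformat())
print("latest:  ", parsed[-1].isoformat())
```

Timestamps with an unusual number of fractional digits may still need extra handling on older Python versions.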


In [63]:
df2 = search2df_top('unalive', None, '2025-03-31T00:00:00.000Z') # only posts from before the first batch's date range
df2.head()

Unnamed: 0,text,author,display_name,date,likes,quotes,replies,reposts,uri,url,reply_to,reply_to_uri,reply_to_url,query
0,"new ""unalive"" euphemism just dropped",joadsprocket.bsky.social,Joad The Wet Sprocket,2025-03-30T23:50:12.319Z,28,0,0,0,at://did:plc:3nqgwlbyx4i6j553jilcb2lo/app.bsky...,https://bsky.app/profile/did:plc:3nqgwlbyx4i6j...,No,,,unalive
1,Molly has every right to unalive Kristina by now.,leslieb68.bsky.social,Leslie,2025-03-30T15:48:54.259Z,3,0,1,0,at://did:plc:tnip6m3flxs2mrlt5bqlc5kd/app.bsky...,https://bsky.app/profile/did:plc:tnip6m3flxs2m...,Yes,at://did:plc:itooo5oj5hr255oyvszbboh7/app.bsky...,https://bsky.app/profile/did:plc:itooo5oj5hr25...,unalive
2,Because in Dictatorships it’s a waste of money...,debsmith1647.bsky.social,,2025-03-30T03:14:18.635Z,29,0,1,1,at://did:plc:stjjhxmg7igobpbyyiy6kt65/app.bsky...,https://bsky.app/profile/did:plc:stjjhxmg7igob...,Yes,at://did:plc:y5xyloyy7s4a2bwfeimj7r3b/app.bsky...,https://bsky.app/profile/did:plc:y5xyloyy7s4a2...,unalive
3,"can we extend this to ""unalive"" also?",yodoops.bsky.social,doops.,2025-03-30T16:29:52.359Z,4,0,1,0,at://did:plc:heyj4lbfp3znjle5kjoxa3xv/app.bsky...,https://bsky.app/profile/did:plc:heyj4lbfp3znj...,Yes,at://did:plc:h3y3f4pmwha4pqzekjpbjg4s/app.bsky...,https://bsky.app/profile/did:plc:h3y3f4pmwha4p...,unalive
4,"Tough stuff, I think if an ice agent tried sna...",420buddha.bsky.social,420buddha,2025-03-30T03:33:24.896Z,11,0,2,0,at://did:plc:n2mdj65y3l2owpgofxjm7mrh/app.bsky...,https://bsky.app/profile/did:plc:n2mdj65y3l2ow...,Yes,at://did:plc:uo2fna47c4v6zcnklxfhcvjb/app.bsky...,https://bsky.app/profile/did:plc:uo2fna47c4v6z...,unalive


In [70]:
date2 = df2[['date']]
date2.sort_values(by=['date']) # no overlap with the first batch's date range!

Unnamed: 0,date
96,2025-03-16T15:28:42.737Z
98,2025-03-16T16:44:12.457Z
97,2025-03-17T01:17:05.026Z
90,2025-03-17T05:22:54.143Z
79,2025-03-17T09:27:02.579Z
...,...
9,2025-03-30T13:50:11.523Z
1,2025-03-30T15:48:54.259Z
3,2025-03-30T16:29:52.359Z
6,2025-03-30T16:37:32.374Z


In [71]:
unalive_top_df = pd.concat([df, df2])

In [72]:
unalive_top_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 198 entries, 0 to 98
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   text          198 non-null    object
 1   author        198 non-null    object
 2   display_name  198 non-null    object
 3   date          198 non-null    object
 4   likes         198 non-null    int64 
 5   quotes        198 non-null    int64 
 6   replies       198 non-null    int64 
 7   reposts       198 non-null    int64 
 8   uri           198 non-null    object
 9   url           198 non-null    object
 10  reply_to      198 non-null    object
 11  reply_to_uri  100 non-null    object
 12  reply_to_url  100 non-null    object
 13  query         198 non-null    object
dtypes: int64(4), object(10)
memory usage: 23.2+ KB
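
The `Index: 198 entries, 0 to 98` line shows that the concatenated frame keeps each batch's original row labels, so the labels 0 to 98 each appear twice. If unique labels are wanted, or if two batches could ever overlap, something like the following sketch may help. It uses two tiny made-up frames rather than the real data.

```python
import pandas as pd

# Made-up stand-ins for two search batches that share one post
first = pd.DataFrame({"uri": ["at://a", "at://b"], "likes": [154, 10]})
second = pd.DataFrame({"uri": ["at://b", "at://c"], "likes": [10, 28]})

combined = (
    pd.concat([first, second], ignore_index=True)  # renumber rows 0..n-1
    .drop_duplicates(subset="uri")                 # keep the first copy of each post
    .reset_index(drop=True)                        # close gaps left by dropped rows
)
print(combined)
```

For the two batches collected here the date ranges do not overlap, so the duplicate check is only a safeguard.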


In [73]:
unalive_top_df.tail()

Unnamed: 0,text,author,display_name,date,likes,quotes,replies,reposts,uri,url,reply_to,reply_to_uri,reply_to_url,query
94,Biodome2 failed. Looks like we should be tryin...,dawnwilliamson.bsky.social,,2025-03-17T10:29:55.416Z,6,0,0,0,at://did:plc:7aqvrita4kqxrcjwgw45opip/app.bsky...,https://bsky.app/profile/did:plc:7aqvrita4kqxr...,Yes,at://did:plc:ux34natbhxube3xgdu3rhf45/app.bsky...,https://bsky.app/profile/did:plc:ux34natbhxube...,unalive
95,THEY'RE UNALIVING HER...\n\nTHEN THEY'RE GONNA...,beanycatte.bsky.social,The noodle θ∆,2025-03-17T09:37:11.508Z,6,0,2,1,at://did:plc:3u7o6h2rx7elcbipmuwmvmrz/app.bsky...,https://bsky.app/profile/did:plc:3u7o6h2rx7elc...,No,,,unalive
96,PSA: starting tonight I'll be on a two-week so...,amiberger.bsky.social,Ami Berger,2025-03-16T15:28:42.737Z,153,0,9,0,at://did:plc:fhuifhwlgpfh233on5jrmxrl/app.bsky...,https://bsky.app/profile/did:plc:fhuifhwlgpfh2...,No,,,unalive
97,"If Labour cuts £675 a month ,I'll save the the...",emmadm101.bsky.social,🇺🇦emmadm101 🇵🇸,2025-03-17T01:17:05.026Z,21,0,0,5,at://did:plc:ihdaraxh2s4vwlmw2yign5au/app.bsky...,https://bsky.app/profile/did:plc:ihdaraxh2s4vw...,Yes,at://did:plc:vovinwhtulbsx4mwfw26r5ni/app.bsky...,https://bsky.app/profile/did:plc:vovinwhtulbsx...,unalive
98,i think only trans people should be allowed to...,livinginjeopardy.bsky.social,transsexual anarchy cringe CEO 🔆🥦,2025-03-16T16:44:12.457Z,7,0,0,1,at://did:plc:bucg2dsfr66ecm7kc3bmepwv/app.bsky...,https://bsky.app/profile/did:plc:bucg2dsfr66ec...,No,,,unalive


In [75]:
#unalive_top_df.to_csv('unalive_top_posts.csv', index=False)