Trying to retrieve original file path names in the results #4

Sid01123 · 2022-06-06T12:53:33Z

Is there a way to get the original pathnames of the images after fit_transform?

I am uploading images to Google Colab and reading them in by their file paths as "/content/name_of_image". I would like to recover this "/content/name_of_image" after running the clustering.

I tried to extract the pathnames per label using the following code, but I seem to be getting the file paths of images created in a temporary directory instead:

CODE
Iloc = cl.results['labels']==0
cl.results['pathnames'][Iloc]

OUTPUT
array(['/tmp/clustimage/8732cb41-c72d-4266-b164-ff453d68428a.png',
'/tmp/clustimage/440fecd8-8a9c-49a0-b100-ccfb66107425.png',
'/tmp/clustimage/3c9c38d8-4da9-4e4f-9130-d3836182b8c6.png',
'/tmp/clustimage/85cc4848-1faf-44ea-ae4c-9d9d88bd6323.png',
'/tmp/clustimage/6127e4fb-1c25-4ba9-8d68-56ef482e3db4.png',
'/tmp/clustimage/abcf85e0-af1a-48f1-8861-122122b64e32.png',
'/tmp/clustimage/275bbde0-394d-4ba4-b4d0-1c67da323c8b.png',
'/tmp/clustimage/30b62285-2628-45c0-86b2-fea305cb8db3.png',
'/tmp/clustimage/c47a6867-3c8f-480c-a7bd-b3e7ec4ba334.png',
'/tmp/clustimage/da5c17fc-de2a-4375-b03c-066a0904428a.png'], dtype='<U56')

I would like the output to contain the original filenames that were in my pathnames list.

erdogant · 2022-06-09T13:49:16Z

Can you show with an example how this occurs?
When I try the flowers example, it stores the filenames and paths correctly.
The unique identifiers are only used if a data matrix is given as an input.

from clustimage import Clustimage
cl = Clustimage(method='pca', embedding='umap')
# Import data
Xlist = cl.import_example(data='flowers')
# Import data in a standardized manner
X = cl.import_data(Xlist)

print(X.keys())
# dict_keys(['img', 'feat', 'xycoord', 'pathnames', 'labels', 'url', 'filenames'])
print(X['filenames'][0:5])
# array(['0001.png', '0002.png', '0003.png', '0004.png', '0005.png'],

What I can do for data matrix input is use the index names of a pandas DataFrame for naming. That way you can control the naming as you wish.
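
A minimal sketch of how such names could be prepared on the user side, assuming the data matrix was built from files whose original paths are still at hand. This is not part of clustimage: the helper name `build_name_lookup` and the example paths are hypothetical. The idea is to use each file's basename as the row name and keep a lookup so the full original path can be recovered afterwards.

```python
import os


def build_name_lookup(paths):
    """Map each file's basename to its full original path.

    Raises ValueError on duplicate basenames, because two rows with the
    same name could not be told apart once only the names are kept.
    """
    lookup = {}
    for path in paths:
        name = os.path.basename(path)
        if name in lookup:
            raise ValueError(f"duplicate filename: {name}")
        lookup[name] = path
    return lookup


# Hypothetical paths, in the same order as the rows of the data matrix.
original_paths = ['/content/cat.png', '/content/dog.png', '/content/bird.png']
lookup = build_name_lookup(original_paths)

# One name per row; dicts keep insertion order, so row order is preserved.
index_names = list(lookup)
print(index_names)

# Translate a name back to the path it came from.
print(lookup['dog.png'])
```

Using basenames rather than full paths as row names is only a design choice for this sketch; it keeps the names short and leaves the full paths in the lookup.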

erdogant · 2022-06-09T14:25:04Z

I added the functionality to read pandas DataFrames.
Update with: pip install -U clustimage

Example:

from clustimage import Clustimage
import pandas as pd
import numpy as np

# Initialize
cl = Clustimage()

# Import data
Xraw = cl.import_example(data='mnist')

print(Xraw)
# array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
#        [ 0.,  0.,  0., ..., 10.,  0.,  0.],
#        [ 0.,  0.,  0., ..., 16.,  9.,  0.],
#        ...,
#        [ 0.,  0.,  1., ...,  6.,  0.,  0.],
#        [ 0.,  0.,  2., ..., 12.,  0.,  0.],
#        [ 0.,  0., 10., ..., 12.,  1.,  0.]])

filenames = list(map(lambda x: str(x) + '.png', np.arange(0, Xraw.shape[0])))
Xraw = pd.DataFrame(Xraw, index=filenames)

print(Xraw)
#            0    1     2     3     4     5   ...   58    59    60    61   62   63
# 0.png     0.0  0.0   5.0  13.0   9.0   1.0  ...  6.0  13.0  10.0   0.0  0.0  0.0
# 1.png     0.0  0.0   0.0  12.0  13.0   5.0  ...  0.0  11.0  16.0  10.0  0.0  0.0
# 2.png     0.0  0.0   0.0   4.0  15.0  12.0  ...  0.0   3.0  11.0  16.0  9.0  0.0
# 3.png     0.0  0.0   7.0  15.0  13.0   1.0  ...  7.0  13.0  13.0   9.0  0.0  0.0
# 4.png     0.0  0.0   0.0   1.0  11.0   0.0  ...  0.0   2.0  16.0   4.0  0.0  0.0
#       ...  ...   ...   ...   ...   ...  ...  ...   ...   ...   ...  ...  ...
# 1792.png  0.0  0.0   4.0  10.0  13.0   6.0  ...  2.0  14.0  15.0   9.0  0.0  0.0
# 1793.png  0.0  0.0   6.0  16.0  13.0  11.0  ...  6.0  16.0  14.0   6.0  0.0  0.0
# 1794.png  0.0  0.0   1.0  11.0  15.0   1.0  ...  2.0   9.0  13.0   6.0  0.0  0.0
# 1795.png  0.0  0.0   2.0  10.0   7.0   0.0  ...  5.0  12.0  16.0  12.0  0.0  0.0
# 1796.png  0.0  0.0  10.0  14.0   8.0   1.0  ...  8.0  12.0  14.0  12.0  1.0  0.0

# Fit and transform data
results = cl.fit_transform(Xraw)

print(results['filenames'])
# array(['0.png', '1.png', '2.png', ..., '1794.png', '1795.png', '1796.png'],
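
To get back to the original question (names per cluster label), the names can be grouped by label once the results are available. The following is a rough sketch, assuming that `results['labels']` and `results['filenames']` are aligned element by element, as the indexing in the original question implies. The `results` dictionary here is a small made-up stand-in so the snippet runs without clustimage, and `names_per_label` is a hypothetical helper, not a library function.

```python
from collections import defaultdict


def names_per_label(labels, names):
    """Group names by cluster label, keeping input order within each group."""
    groups = defaultdict(list)
    for label, name in zip(labels, names):
        # int()/str() also accept NumPy scalar types, so arrays work too.
        groups[int(label)].append(str(name))
    return dict(groups)


# Stand-in for the dictionary returned by fit_transform; values are made up.
results = {
    'labels': [0, 1, 0, 2, 1],
    'filenames': ['0.png', '1.png', '2.png', '3.png', '4.png'],
}

groups = names_per_label(results['labels'], results['filenames'])

# Names of all samples assigned to cluster 0.
print(groups[0])
```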

Sid01123 · 2022-06-09T14:29:50Z

Dear Mr. Taskesen, Thank you so much for taking the time to add this update. I shall try it out and let you know! Best, Sid


erdogant · 2022-08-23T11:12:22Z

I am closing this one. Re-open this issue if required.

erdogant closed this as completed Aug 23, 2022
