Is it actually better to run file.exists() in a loop? #2204

Closed
MichaelChirico opened this issue Sep 21, 2023 · 5 comments
Labels: internals (issues related to inner workings of lintr, i.e., not user-visible), performance

Comments

@MichaelChirico
Collaborator

lintr/R/settings_utils.R

Lines 78 to 82 in 945c2a2

for (loc in file_locations) {
  if (file.exists(loc)) {
    return(loc)
  }
}

I think this is written under the assumption that typically the first/second entry will give the config, thus it's "not worth" running file.exists() for the full vector of possible config locations.

But file.exists() is not all that expensive, is it? And this code is not executed 100s of times? So maybe it's simpler to just write this code in terms of checking the first TRUE entry on the file.exists(<vector>) result.

Should check the result quantitatively first, though.
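
For concreteness, the two variants being compared could be sketched roughly as follows. This is only an illustration: `find_first_loop()`, `find_first_vec()` and the temporary paths are hypothetical names made up here, not lintr's actual code.

```r
# Hypothetical helpers for illustration; not the actual lintr implementation.

# Early-exit variant: stops probing at the first existing path.
find_first_loop <- function(file_locations) {
  for (loc in file_locations) {
    if (file.exists(loc)) {
      return(loc)
    }
  }
  NULL
}

# Vectorized variant: probes every path, then picks the first hit.
find_first_vec <- function(file_locations) {
  hits <- which(file.exists(file_locations))
  if (length(hits) > 0L) file_locations[hits[1L]] else NULL
}

# Demo: one real file placed between two paths that do not exist.
real <- tempfile()
file.create(real)
candidates <- c(tempfile(), real, tempfile())

same_answer <- identical(find_first_loop(candidates), find_first_vec(candidates))
unlink(real)
```

Both helpers return `NULL` when nothing exists, so a caller's `is.null()` check would work with either one.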

MichaelChirico added the performance and internals labels Sep 21, 2023
@MichaelChirico
Collaborator Author

Benchmark:

library(microbenchmark)

fake_files <- replicate(n = 10L, paste(sample(letters, 20L, TRUE), collapse = ""))
real_file <- tempfile()
file.create(real_file)

real_file_at =
  function(fake_files, pos) if (pos == 0L) append(fake_files, fake_files[1L]) else append(fake_files, real_file, after=pos-1L)
loop = function(locs) for (loc in locs) if (file.exists(loc)) return(loc)
vec = function(locs) {
  idx <- which(file.exists(locs))
  if (length(idx) > 0L) {
    locs[idx]
  }
}

for (ii in seq_along(fake_files)) {
  for (jj in 0:(ii + 1L)) {
    locs <- real_file_at(head(fake_files, ii), jj)
    message(ii + 1L, " total files, real found at ", jj)
    print(microbenchmark(times = 1e4, loop(locs), vec(locs)))
  }
}

output:

```
2 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median     uq      max neval cld
 loop(locs) 17.870 28.1905 48.99391 36.9995 47.820 1077.819 10000  a 
  vec(locs) 19.071 29.6185 52.81980 38.8705 53.139  439.640 10000   b
2 total files, real found at 1
Unit: microseconds
       expr    min     lq      mean  median     uq    max neval cld
 loop(locs)  2.499  3.415  6.287821  4.8400  6.865 177.74 10000  a 
  vec(locs) 13.561 20.490 34.852762 27.3805 36.616 428.61 10000   b
2 total files, real found at 2
Unit: microseconds
       expr    min      lq     mean  median      uq      max neval cld
 loop(locs) 13.681 24.0755 52.63211 35.7360 63.1015 1383.490 10000  a 
  vec(locs) 11.911 27.1615 61.58512 41.7705 75.4215  656.321 10000   b
3 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median      uq     max neval cld
 loop(locs) 26.830 43.9290 80.18354 56.7700 83.2440  803.21 10000  a 
  vec(locs) 26.859 44.7255 82.46350 57.6335 87.8535 2154.80 10000   b
3 total files, real found at 1
Unit: microseconds
       expr    min    lq      mean  median      uq   max neval cld
 loop(locs)  2.491  3.14  6.756296  4.5900  6.5205 212.4 10000  a 
  vec(locs) 22.239 32.07 56.625927 40.8405 57.6895 553.1 10000   b
3 total files, real found at 2
Unit: microseconds
       expr    min      lq     mean median      uq     max neval cld
 loop(locs) 11.412 18.1410 31.18859 23.280 29.5905 551.581 10000  a 
  vec(locs) 23.961 32.6295 56.69199 42.037 55.3215 428.200 10000   b
3 total files, real found at 3
Unit: microseconds
       expr    min     lq     mean  median      uq     max neval cld
 loop(locs) 23.139 32.509 57.67547 42.6935 56.4195 681.453 10000  a 
  vec(locs) 23.430 33.469 60.20279 43.3595 61.9840 550.089 10000   b
4 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median     uq     max neval cld
 loop(locs) 37.160 55.9895 92.83122 72.5550 93.940 591.221 10000   a
  vec(locs) 34.099 55.3210 93.03410 71.3905 98.238 624.912 10000   a
4 total files, real found at 1
Unit: microseconds
       expr    min     lq      mean  median      uq     max neval cld
 loop(locs)  2.481  3.446  8.089403  4.9810  7.9105 262.301 10000  a 
  vec(locs) 28.001 46.622 85.400301 60.2205 91.2260 848.862 10000   b
4 total files, real found at 2
Unit: microseconds
       expr    min      lq     mean  median     uq     max neval cld
 loop(locs) 12.081 18.5600 32.53282 24.0100 31.075  384.16 10000  a 
  vec(locs) 30.410 45.3205 77.15903 57.7955 78.485 1196.94 10000   b
4 total files, real found at 3
Unit: microseconds
       expr   min      lq     mean  median      uq     max neval cld
 loop(locs) 22.55 32.7105 59.55014 42.7305 58.7850 486.769 10000  a 
  vec(locs) 29.52 46.4600 82.96790 59.8500 86.3055 862.731 10000   b
4 total files, real found at 4
Unit: microseconds
       expr    min      lq     mean  median      uq      max neval cld
 loop(locs) 34.480 46.6515 83.80528 61.2310 82.8765 1458.131 10000   a
  vec(locs) 29.231 47.2405 83.61660 61.0305 88.5160  532.922 10000   a
5 total files, real found at 0
Unit: microseconds
       expr    min     lq     mean  median       uq      max neval cld
 loop(locs) 45.891 69.635 110.8352 89.6495 112.9700 1138.851 10000   b
  vec(locs) 46.268 68.038 105.7484 85.8345 110.9055  589.731 10000  a 
5 total files, real found at 1
Unit: microseconds
       expr    min     lq      mean median      uq     max neval cld
 loop(locs)  2.491  3.170  7.812393  4.870   7.721 225.801 10000  a 
  vec(locs) 36.381 57.606 98.174317 73.574 105.260 794.259 10000   b
5 total files, real found at 2
Unit: microseconds
       expr   min      lq     mean median      uq      max neval cld
 loop(locs) 10.81 18.8845 34.08508  24.51  33.270  333.519 10000  a 
  vec(locs) 35.45 58.5115 99.94544  74.66 104.985 1637.042 10000   b
5 total files, real found at 3
Unit: microseconds
       expr    min      lq     mean  median      uq     max neval cld
 loop(locs) 22.000 32.5910 54.35514 42.8810  55.406 567.943 10000  a 
  vec(locs) 34.892 58.7055 95.85017 75.3765 102.421 506.341 10000   b
5 total files, real found at 4
Unit: microseconds
       expr    min      lq     mean  median       uq     max neval cld
 loop(locs) 30.391 53.1955 78.78881 64.5195  82.6755 510.901 10000  a 
  vec(locs) 33.861 69.1925 99.73531 81.6760 106.7680 759.291 10000   b
5 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean  median       uq     max neval cld
 loop(locs) 42.337 61.5200 101.7741 79.7495 104.7480 602.910 10000   a
  vec(locs) 35.450 61.2745 101.4287 77.7145 107.6945 668.263 10000   a
6 total files, real found at 0
Unit: microseconds
       expr    min     lq     mean   median       uq     max neval cld
 loop(locs) 60.116 83.927 130.8284 106.5910 136.6250 813.184 10000   b
  vec(locs) 49.330 80.831 124.4718 101.3415 134.7935 651.199 10000  a 
6 total files, real found at 1
Unit: microseconds
       expr    min     lq       mean  median       uq     max neval cld
 loop(locs)  2.519  3.550   8.502554  5.1600   8.7245 262.230 10000  a 
  vec(locs) 40.870 72.985 122.045560 93.3805 135.1845 878.342 10000   b
6 total files, real found at 2
Unit: microseconds
       expr    min     lq      mean  median       uq      max neval cld
 loop(locs) 11.450 19.131  34.47291 25.0805  34.2045  401.080 10000  a 
  vec(locs) 49.242 72.375 121.51939 92.3990 132.3020 1665.832 10000   b
6 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean  median       uq      max neval cld
 loop(locs) 21.431 33.9310  62.05001 44.5395  62.7865  544.962 10000  a 
  vec(locs) 45.861 76.6665 129.62014 96.4850 145.1880 1500.534 10000   b
6 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean  median       uq     max neval cld
 loop(locs) 31.171 48.8255  82.87363 62.5845  85.5605 552.949 10000  a 
  vec(locs) 51.900 77.7525 125.30427 96.0765 136.7465 622.351 10000   b
6 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean median      uq     max neval cld
 loop(locs) 39.090 62.1365 103.3511 80.316 107.392 663.350 10000  a 
  vec(locs) 46.243 75.4405 120.6737 95.095 133.111 571.163 10000   b
6 total files, real found at 6
Unit: microseconds
       expr    min     lq     mean median       uq      max neval cld
 loop(locs) 52.031 74.596 123.7087 95.231 128.8815 1012.595 10000   b
  vec(locs) 47.223 72.401 117.6038 91.594 126.4010  687.765 10000  a 
7 total files, real found at 0
Unit: microseconds
       expr   min     lq     mean  median       uq     max neval cld
 loop(locs) 70.35 99.622 154.2360 127.340 164.6610 927.593 10000   b
  vec(locs) 57.74 95.552 145.3851 120.189 160.6905 804.578 10000  a 
7 total files, real found at 1
Unit: microseconds
       expr   min      lq       mean  median      uq     max neval cld
 loop(locs)  2.51  3.2000   8.167044   5.080   8.630 207.880 10000  a 
  vec(locs) 45.06 82.9995 131.714752 104.955 142.969 986.641 10000   b
7 total files, real found at 2
Unit: microseconds
       expr    min      lq      mean   median      uq      max neval cld
 loop(locs) 13.301 18.7610  33.58254  24.2015  32.986  650.461 10000  a 
  vec(locs) 49.942 82.9265 133.78363 104.8375 146.855 1169.944 10000   b
7 total files, real found at 3
Unit: microseconds
       expr    min     lq     mean   median       uq     max neval cld
 loop(locs) 21.138 32.969  55.9710  42.9005  56.3930 447.748 10000  a 
  vec(locs) 56.081 85.089 131.9428 107.3900 140.9085 811.452 10000   b
7 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean   median       uq     max neval cld
 loop(locs) 32.670 46.2400  78.98536  59.8655  81.1455 716.420 10000  a 
  vec(locs) 54.969 84.2855 135.97173 106.6100 149.8070 782.851 10000   b
7 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean   median       uq     max neval cld
 loop(locs) 41.970 62.6145 109.2771  81.2980 114.7280 759.611 10000  a 
  vec(locs) 58.611 89.2170 147.7354 113.1395 167.4145 975.180 10000   b
7 total files, real found at 6
Unit: microseconds
       expr    min      lq     mean   median       uq     max neval cld
 loop(locs) 52.965 73.4930 112.7751  92.0055 115.1710 989.648 10000  a 
  vec(locs) 57.407 83.6985 127.1130 105.0060 135.3005 924.222 10000   b
7 total files, real found at 7
Unit: microseconds
       expr    min      lq     mean   median      uq      max neval cld
 loop(locs) 58.874 87.9805 137.5577 111.3755 144.111 1405.482 10000   b
  vec(locs) 53.674 84.8750 131.7938 107.1115 143.135  789.201 10000  a 
8 total files, real found at 0
Unit: microseconds
       expr    min       lq     mean  median       uq      max neval cld
 loop(locs) 76.679 113.2730 172.0508 144.805 184.3210 1361.529 10000   b
  vec(locs) 61.680 107.5105 158.8442 134.405 174.2765 1129.621 10000  a 
8 total files, real found at 1
Unit: microseconds
       expr    min      lq       mean   median       uq      max neval cld
 loop(locs)  2.500  3.3690   8.769578   5.2300   9.1595  299.830 10000  a 
  vec(locs) 62.652 97.5485 157.310377 123.1955 174.8530 1046.031 10000   b
8 total files, real found at 2
Unit: microseconds
       expr   min      lq      mean   median       uq     max neval cld
 loop(locs) 12.93 18.8695  33.90164  24.1300  33.6650 388.450 10000  a 
  vec(locs) 60.30 96.6440 151.72458 121.7105 167.2715 766.032 10000   b
8 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 22.652  33.604  60.23582  43.6990  61.8390  788.991 10000  a 
  vec(locs) 56.820 100.295 160.67537 125.4705 179.6205 2675.983 10000   b
8 total files, real found at 4
Unit: microseconds
       expr    min     lq      mean   median       uq      max neval cld
 loop(locs) 31.381 46.412  80.21577  60.5375  82.9215  491.881 10000  a 
  vec(locs) 60.048 97.150 156.15434 123.4610 172.8950 1090.493 10000   b
8 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 42.899  64.903 123.6422  84.4145 130.1285 1786.180 10000  a 
  vec(locs) 62.449 106.349 184.9606 135.4195 217.0555 1767.529 10000   b
8 total files, real found at 6
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 54.630 74.7615 125.1428  96.4805 129.0300  884.852 10000  a 
  vec(locs) 66.401 99.0105 157.5316 125.0855 172.2295 1175.100 10000   b
8 total files, real found at 7
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 62.651  92.1410 153.9715 118.2615 164.5725  971.703 10000  a 
  vec(locs) 58.911 103.1515 164.7046 130.2760 188.4175 1182.892 10000   b
8 total files, real found at 8
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 74.651 104.9605 173.7515 135.5065 186.1400 1093.276 10000   b
  vec(locs) 63.817 102.0215 164.4138 129.4380 184.3915  836.451 10000  a 
9 total files, real found at 0
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 89.114 130.7395 208.8363 167.2365 221.507 1732.234 10000   b
  vec(locs) 84.858 125.7055 190.6436 156.5780 211.944 1125.929 10000  a 
9 total files, real found at 1
Unit: microseconds
       expr   min      lq     mean   median       uq      max neval cld
 loop(locs)  2.50   3.490  10.0330   5.4600  11.0345  189.920 10000  a 
  vec(locs) 76.23 113.657 190.7197 145.6675 227.1260 1759.663 10000   b
9 total files, real found at 2
Unit: microseconds
       expr   min      lq      mean   median       uq      max neval cld
 loop(locs) 12.26  20.210  38.69672  26.4245  39.5395  440.930 10000  a 
  vec(locs) 70.62 114.705 188.03972 145.3975 214.7940 1316.392 10000   b
9 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 21.680  33.890  63.79428  44.3515  66.0555  743.182 10000  a 
  vec(locs) 73.711 115.901 193.10961 146.7505 224.2530 1334.195 10000   b
9 total files, real found at 4
Unit: microseconds
       expr    min       lq     mean  median      uq      max neval cld
 loop(locs) 31.849  47.7600  85.6455  61.661  88.820 1066.143 10000  a 
  vec(locs) 78.051 114.1395 184.7729 145.230 215.429 1770.563 10000   b
9 total files, real found at 5
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 42.941  68.3870 114.2971  86.4250 122.9000  788.122 10000  a 
  vec(locs) 66.562 125.8325 192.7304 156.6015 223.3535 1178.172 10000   b
9 total files, real found at 6
Unit: microseconds
       expr   min       lq     mean   median       uq      max neval cld
 loop(locs) 52.83  82.8255 142.5620 106.7985 156.2305 1296.893 10000  a 
  vec(locs) 76.80 127.4205 201.7696 160.5535 239.4850 3434.493 10000   b
9 total files, real found at 7
Unit: microseconds
       expr    min      lq     mean  median       uq      max neval cld
 loop(locs) 61.129  92.080 157.1370 119.412 168.9855  964.710 10000  a 
  vec(locs) 67.401 118.292 187.4829 148.590 215.1030 1352.517 10000   b
9 total files, real found at 8
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 65.661 111.150 192.1274 143.1210 216.7155 1417.890 10000  a 
  vec(locs) 75.956 121.066 197.9968 154.0885 235.5515 1271.998 10000   b
9 total files, real found at 9
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 85.370 124.8820 209.9794 160.0455 235.2830 2549.833 10000   b
  vec(locs) 79.849 120.5785 196.3203 152.9670 233.2275 1396.779 10000  a 
10 total files, real found at 0
Unit: microseconds
       expr     min       lq     mean  median       uq      max neval cld
 loop(locs) 102.684 146.2065 228.4999 187.255 250.8625 1673.933 10000   b
  vec(locs)  93.564 138.9850 209.5967 173.772 240.4860 1074.144 10000  a 
10 total files, real found at 1
Unit: microseconds
       expr   min       lq       mean  median       uq      max neval cld
 loop(locs)  2.49   3.5010   9.942367   5.491  11.2655  194.230 10000  a 
  vec(locs) 75.01 127.6745 201.804136 162.750 236.7695 1115.951 10000   b
10 total files, real found at 2
Unit: microseconds
       expr    min      lq      mean   median      uq      max neval cld
 loop(locs) 11.151  19.492  36.86145  25.8210  38.376  408.303 10000  a 
  vec(locs) 67.051 125.116 196.41908 157.9205 227.235 2129.613 10000   b
10 total files, real found at 3
Unit: microseconds
       expr   min       lq      mean   median       uq      max neval cld
 loop(locs) 24.08  34.2740  64.20581  44.8505  66.8095 1792.873 10000  a 
  vec(locs) 75.16 128.7305 206.90379 163.9790 244.6855 1395.323 10000   b
10 total files, real found at 4
Unit: microseconds
       expr    min       lq      mean   median      uq      max neval cld
 loop(locs) 33.511  48.4560  88.46707  63.4000  91.965  842.059 10000  a 
  vec(locs) 75.521 128.3705 211.47564 164.7955 252.382 2109.791 10000   b
10 total files, real found at 5
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 40.311  62.1860 108.1344  81.1855 113.4005  735.512 10000  a 
  vec(locs) 76.281 128.3545 202.1772 162.3330 235.6045 1211.972 10000   b
10 total files, real found at 6
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 53.660  77.1565 133.5966 100.0520 142.805  928.441 10000  a 
  vec(locs) 83.011 130.9260 208.0524 165.9215 247.098 1206.363 10000   b
10 total files, real found at 7
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 59.769 106.2700 159.5447 128.1195 174.2310 2514.313 10000  a 
  vec(locs) 82.278 146.9465 211.1607 176.4505 239.0265 1880.582 10000   b
10 total files, real found at 8
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 75.183 103.7705 172.4498 134.7515 190.3160 1076.172 10000  a 
  vec(locs) 88.171 126.9165 198.4967 161.7920 228.6555 1799.829 10000   b
10 total files, real found at 9
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 86.370 122.634 200.4433 157.8365 222.1445 1337.050 10000  a 
  vec(locs) 81.501 130.974 204.6119 164.6360 234.2560 1203.941 10000   b
10 total files, real found at 10
Unit: microseconds
       expr    min       lq     mean   median      uq       max neval cld
 loop(locs) 91.060 135.3200 223.3699 175.8010 249.980  1445.643 10000   b
  vec(locs) 81.951 130.3565 207.5731 165.2895 240.379 12500.051 10000  a 
11 total files, real found at 0
Unit: microseconds
       expr     min       lq     mean   median       uq       max neval cld
 loop(locs) 113.621 159.4995 251.9873 204.8455 281.3155  1506.049 10000   b
  vec(locs) 100.650 150.2445 226.8992 188.4595 258.2795 13035.959 10000  a 
11 total files, real found at 1
Unit: microseconds
       expr    min       lq      mean   median       uq      max neval cld
 loop(locs)  2.501   3.5000  10.27878   5.6710  12.0015  317.011 10000  a 
  vec(locs) 84.581 142.6225 223.90127 180.9305 267.8265 1013.650 10000   b
11 total files, real found at 2
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 13.179  19.6690  37.2093  25.8145  39.5895  324.179 10000  a 
  vec(locs) 86.749 139.4285 215.9513 175.5440 252.8505 1585.153 10000   b
11 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 22.031  34.676  62.55068  44.8410  65.8265  493.571 10000  a 
  vec(locs) 93.513 144.786 224.68801 182.1255 266.1405 1133.433 10000   b
11 total files, real found at 4
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 34.348  51.6205 102.8904  69.9365 117.112 1181.782 10000  a 
  vec(locs) 79.088 153.3130 259.5883 201.5985 317.711 1394.995 10000   b
11 total files, real found at 5
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 42.971  65.4810 120.5220  85.8710 133.1665  805.308 10000  a 
  vec(locs) 77.657 149.5515 243.2025 189.3335 292.9315 1446.559 10000   b
11 total files, real found at 6
Unit: microseconds
       expr    min       lq     mean  median       uq      max neval cld
 loop(locs) 54.075  81.0440 137.4031 103.821 148.3190  950.243 10000  a 
  vec(locs) 93.118 151.7065 231.5220 187.700 272.1965 1175.661 10000   b
11 total files, real found at 7
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 62.388  91.800 155.5682 118.9210 167.5855 1425.481 10000  a 
  vec(locs) 94.401 144.235 223.5070 182.3255 261.0105  989.981 10000   b
11 total files, real found at 8
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 73.721 133.1145 188.8862 155.0005 206.191  959.812 10000  a 
  vec(locs) 93.739 171.8000 239.4300 204.1710 276.805 2056.110 10000   b
11 total files, real found at 9
Unit: microseconds
       expr    min       lq     mean  median       uq      max neval cld
 loop(locs) 87.359 151.8945 208.4412 174.495 227.8760 1055.641 10000  a 
  vec(locs) 74.271 172.4880 232.6865 200.836 261.3515  991.664 10000   b
11 total files, real found at 10
Unit: microseconds
       expr    min       lq     mean   median       uq       max neval cld
 loop(locs) 94.080 139.5465 223.6669 176.7290 245.6690 13231.167 10000   a
  vec(locs) 92.316 147.7120 222.2848 182.1805 254.5925  1081.023 10000   a
11 total files, real found at 11
Unit: microseconds
       expr     min       lq     mean  median       uq       max neval cld
 loop(locs) 106.131 154.6155 247.3232 196.261 278.5225 13437.881 10000   b
  vec(locs)  88.701 148.5615 223.0915 183.291 259.8085  1018.201 10000  a 
```

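So the `for` approach is a clear winner overall: it is far faster whenever the real file sits early in the vector, and only marginally slower when the file is last or absent. This is probably owing to the branch prediction needed for `if (length(idx) > 0L)`.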
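
The timings above are also consistent with the loop simply probing fewer paths when the real file comes early. A rough way to check that is to count how many paths each approach passes to `file.exists()`. This is a sketch only: `counting_exists()`, `loop_counted()` and `vec_counted()` are hypothetical instrumented copies written for this illustration, and the count says nothing about timing by itself.

```r
# Hypothetical instrumentation: count how many paths each approach probes.
n_checked <- 0L
counting_exists <- function(paths) {
  n_checked <<- n_checked + length(paths)
  file.exists(paths)
}

# Instrumented copies of the two approaches from the benchmark.
loop_counted <- function(locs) {
  for (loc in locs) if (counting_exists(loc)) return(loc)
  NULL
}
vec_counted <- function(locs) {
  locs[which(counting_exists(locs))[1L]]
}

# Real file first, followed by nine paths that do not exist.
real <- tempfile()
file.create(real)
locs <- c(real, replicate(9L, tempfile()))

n_checked <- 0L
invisible(loop_counted(locs))
loop_checks <- n_checked  # the loop stops after the first path

n_checked <- 0L
invisible(vec_counted(locs))
vec_checks <- n_checked   # the vectorized call probes all ten paths

unlink(real)
```

With the real file in the first position, the loop probes one path while the vectorized version probes all ten, which lines up with the roughly constant `loop(locs)` timings in the "real found at 1" rows.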

@MichaelChirico
Collaborator Author

We could rework the rest of the code to allow the vec() approach to return NA instead of NULL and eliminate the branch (at least in this function; the consuming function still needs an if() branch, but it already has one checking if (!is.null)). Here are the results in that case:

vec = function(locs) locs[which(file.exists(locs))[1L]]

2 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median      uq      max neval cld
 loop(locs) 16.761 37.3715 53.40856 42.7850 55.2600 2425.264 10000  a 
  vec(locs) 20.782 39.9155 59.52739 46.5005 61.1805 3014.215 10000   b
2 total files, real found at 1
Unit: microseconds
       expr    min     lq      mean median     uq     max neval cld
 loop(locs)  2.501  4.041  7.068531   5.15  7.060 319.969 10000  a 
  vec(locs) 13.870 23.490 37.179534  27.19 36.675 360.231 10000   b
2 total files, real found at 2
Unit: microseconds
       expr    min     lq     mean median     uq     max neval cld
 loop(locs) 12.031 24.071 33.43134 27.051 33.307 393.952 10000  a 
  vec(locs) 11.981 26.292 37.24806 29.741 37.111 470.122 10000   b
3 total files, real found at 0
Unit: microseconds
       expr   min      lq     mean median      uq     max neval cld
 loop(locs) 27.91 50.8645 75.78936 59.970 80.0510 683.802 10000  a 
  vec(locs) 27.10 52.2590 77.99660 61.625 84.2555 494.849 10000   b
3 total files, real found at 1
Unit: microseconds
       expr    min     lq     mean median     uq     max neval cld
 loop(locs)  2.501  4.596  7.28053  5.451  7.546 269.432 10000  a 
  vec(locs) 22.181 39.892 55.20566 45.102 56.556 534.621 10000   b
3 total files, real found at 2
Unit: microseconds
       expr    min      lq     mean  median      uq     max neval cld
 loop(locs) 12.381 21.7915 31.70452 25.7915 31.2915 352.992 10000  a 
  vec(locs) 24.401 38.7010 54.30225 44.7025 55.1260 489.752 10000   b
3 total files, real found at 3
Unit: microseconds
       expr    min      lq     mean median     uq     max neval cld
 loop(locs) 22.680 35.9260 52.21908 44.611 53.001 625.011 10000  a 
  vec(locs) 23.271 37.1045 53.21847 44.991 54.390 475.250 10000   b
4 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median      uq     max neval cld
 loop(locs) 37.288 54.4055 85.65664 66.8815 87.8325  656.84 10000   a
  vec(locs) 33.841 53.9910 85.05811 67.2660 88.6825 1009.24 10000   a
4 total files, real found at 1
Unit: microseconds
       expr    min      lq      mean  median      uq     max neval cld
 loop(locs)  2.499  3.3900  7.107401  4.9395  6.9605 430.001 10000  a 
  vec(locs) 30.690 43.8605 73.461519 57.3350 74.5195 613.010 10000   b
4 total files, real found at 2
Unit: microseconds
       expr    min    lq     mean  median    uq      max neval cld
 loop(locs) 12.640 18.78 33.03697 24.5450 31.76  500.081 10000  a 
  vec(locs) 30.531 45.90 77.58347 58.5105 77.72 4604.585 10000   b
4 total files, real found at 3
Unit: microseconds
       expr    min      lq     mean median     uq     max neval cld
 loop(locs) 21.862 32.2500 55.22287 40.405 54.124 509.332 10000  a 
  vec(locs) 29.950 45.0445 74.80565 56.570 76.870 500.560 10000   b
4 total files, real found at 4
Unit: microseconds
       expr    min     lq     mean  median      uq      max neval cld
 loop(locs) 32.561 45.191 75.53145 56.3605 74.5165 1220.342 10000   a
  vec(locs) 30.821 44.667 74.31819 56.7110 75.7015  754.422 10000   a
5 total files, real found at 0
Unit: microseconds
       expr    min     lq     mean  median       uq     max neval cld
 loop(locs) 48.040 69.871 113.7842 90.4265 117.5565 843.450 10000   b
  vec(locs) 45.541 68.791 111.1404 87.6115 118.7705 617.833 10000  a 
5 total files, real found at 1
Unit: microseconds
       expr    min      lq      mean  median     uq    max neval cld
 loop(locs)  2.510  3.2990  7.518837  4.9590  7.750 211.08 10000  a 
  vec(locs) 35.599 57.3455 90.304520 72.9795 93.215 538.44 10000   b
5 total files, real found at 2
Unit: microseconds
       expr    min     lq     mean  median       uq     max neval cld
 loop(locs) 10.391 18.881 33.07855 24.9710  33.0015 340.952 10000  a 
  vec(locs) 36.022 58.631 95.87063 75.3465 102.4770 786.600 10000   b
5 total files, real found at 3
Unit: microseconds
       expr    min      lq     mean  median     uq     max neval cld
 loop(locs) 23.720 31.6595 51.54894 40.6145 50.794 550.490 10000  a 
  vec(locs) 35.321 56.9590 88.79938 71.9810 90.895 570.543 10000   b
5 total files, real found at 4
Unit: microseconds
       expr    min      lq     mean  median      uq      max neval cld
 loop(locs) 32.330 45.4645 74.80097 57.3685 74.0510  601.741 10000  a 
  vec(locs) 37.011 57.3195 94.34300 73.0500 97.1255 1194.601 10000   b
5 total files, real found at 5
Unit: microseconds
       expr    min    lq     mean  median       uq     max neval cld
 loop(locs) 39.920 60.93 102.9944 79.2450 106.9800 850.111 10000   b
  vec(locs) 37.369 59.54 100.2271 76.0405 107.1795 769.660 10000  a 
6 total files, real found at 0
Unit: microseconds
       expr   min      lq     mean   median       uq      max neval cld
 loop(locs) 57.15 85.1320 139.9256 110.4120 149.0065 1145.071 10000   b
  vec(locs) 58.49 82.8895 132.5187 105.5525 147.1005  735.391 10000  a 
6 total files, real found at 1
Unit: microseconds
       expr    min      lq     mean median       uq     max neval cld
 loop(locs)  2.510  3.1300   8.1014  4.875   8.0300 140.759 10000  a 
  vec(locs) 41.044 68.4545 114.0713 86.090 123.2395 959.899 10000   b
6 total files, real found at 2
Unit: microseconds
       expr   min      lq      mean  median       uq     max neval cld
 loop(locs) 13.42 19.0185  36.14156 25.0690  35.7555 357.190 10000  a 
  vec(locs) 45.84 71.3800 121.05920 91.4895 132.1550 712.793 10000   b
6 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean  median      uq     max neval cld
 loop(locs) 23.091 32.9395  61.14991 43.0450  62.205 466.990 10000  a 
  vec(locs) 46.580 71.4290 124.80068 91.9595 139.254 725.912 10000   b
6 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean median      uq     max neval cld
 loop(locs) 33.209 46.0485  75.72664 58.905  76.364 500.871 10000  a 
  vec(locs) 48.792 70.5840 110.61468 88.626 117.374 731.551 10000   b
6 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean  median       uq      max neval cld
 loop(locs) 41.471 62.1005 105.4868 80.0410 109.2105 1034.909 10000  a 
  vec(locs) 47.131 74.2065 122.4426 94.0055 136.1255  708.053 10000   b
6 total files, real found at 6
Unit: microseconds
       expr    min      lq     mean  median       uq     max neval cld
 loop(locs) 53.301 78.8395 128.3213 101.095 134.8705 745.071 10000   b
  vec(locs) 43.840 76.9440 120.2694  95.920 130.8700 820.752 10000  a 
7 total files, real found at 0
Unit: microseconds
       expr   min       lq     mean   median      uq      max neval cld
 loop(locs) 64.19 112.0060 175.9034 138.1115 186.787 3298.805 10000   b
  vec(locs) 54.73 109.9565 165.1016 132.5070 183.915 1196.541 10000  a 
7 total files, real found at 1
Unit: microseconds
       expr    min      lq       mean  median      uq     max neval cld
 loop(locs)  2.481  3.4310   8.977981   5.231   9.626 189.711 10000  a 
  vec(locs) 50.252 85.8145 139.941006 109.430 156.901 896.649 10000   b
7 total files, real found at 2
Unit: microseconds
       expr   min     lq      mean   median      uq     max neval cld
 loop(locs) 11.76 19.470  36.32068  25.7295  35.565 514.199 10000  a 
  vec(locs) 52.70 86.349 139.81899 108.6205 154.176 746.142 10000   b
7 total files, real found at 3
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 21.601 33.6810  61.44753  44.0755  62.3165 1294.380 10000  a 
  vec(locs) 54.506 87.0355 144.47723 110.7855 165.5175  796.942 10000   b
7 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean   median       uq     max neval cld
 loop(locs) 33.682 48.8010  86.46101  63.2115  88.9255 619.120 10000  a 
  vec(locs) 51.931 90.4505 147.34809 113.3610 166.4245 913.003 10000   b
7 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 42.209 62.1950 108.0909  80.7855 111.8145  713.572 10000  a 
  vec(locs) 51.139 88.4045 144.8659 112.0620 162.5485 1661.022 10000   b
7 total files, real found at 6
Unit: microseconds
       expr    min     lq     mean   median       uq      max neval cld
 loop(locs) 51.892 79.617 141.3881 103.0955 154.0865 1504.153 10000  a 
  vec(locs) 54.361 91.706 153.2543 116.8315 176.4770  974.972 10000   b
7 total files, real found at 7
Unit: microseconds
       expr    min     lq     mean   median       uq      max neval cld
 loop(locs) 64.141 95.932 157.0550 122.0515 167.4695  844.579 10000   b
  vec(locs) 57.901 92.130 145.7694 115.2650 162.7060 1106.655 10000  a 
8 total files, real found at 0
Unit: microseconds
       expr    min       lq     mean  median       uq      max neval cld
 loop(locs) 78.731 117.5960 192.8229 151.333 209.2310 1307.022 10000   b
  vec(locs) 63.011 115.3355 179.7590 144.311 203.1725 1130.733 10000  a 
8 total files, real found at 1
Unit: microseconds
       expr    min       lq      mean  median       uq     max neval cld
 loop(locs)  2.520   4.7455  10.81175   6.321  11.9710 373.161 10000  a 
  vec(locs) 57.321 116.6260 177.29582 142.441 206.5225 925.356 10000   b
8 total files, real found at 2
Unit: microseconds
       expr    min       lq      mean   median       uq      max neval cld
 loop(locs) 11.072  19.7670  36.85665  25.9165  37.7620  469.331 10000  a 
  vec(locs) 64.723 100.4715 160.57480 126.3890 178.2715 1243.082 10000   b
8 total files, real found at 3
Unit: microseconds
       expr    min       lq      mean   median       uq      max neval cld
 loop(locs) 22.291  37.8710  78.36989  49.1460  85.5560 1171.572 10000  a 
  vec(locs) 55.459 114.9595 202.72105 146.0665 251.7545 2463.226 10000   b
8 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 31.551 47.3405  84.16831  61.8015  85.4205  644.761 10000  a 
  vec(locs) 60.113 98.6370 159.84481 125.6265 176.9600 1110.933 10000   b
8 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean  median       uq      max neval cld
 loop(locs) 43.531  62.661 112.5555  82.191 119.0455 1410.729 10000  a 
  vec(locs) 58.352 102.662 168.9372 129.596 191.5060 1418.071 10000   b
8 total files, real found at 6
Unit: microseconds
       expr    min       lq     mean   median      uq     max neval cld
 loop(locs) 50.759  81.8895 125.9041 102.7205 134.019 728.051 10000  a 
  vec(locs) 55.120 108.7650 156.8111 132.4000 173.684 794.511 10000   b
8 total files, real found at 7
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 61.592  95.2595 155.8516 121.0805 163.9995  925.511 10000  a 
  vec(locs) 69.957 106.3910 165.9426 131.6160 186.1150 1166.468 10000   b
8 total files, real found at 8
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 76.391 107.135 176.8749 137.1155 192.0245 1521.530 10000   b
  vec(locs) 60.690 102.310 164.5292 129.7000 186.3165 1132.853 10000  a 
9 total files, real found at 0
Unit: microseconds
       expr    min      lq     mean  median      uq      max neval cld
 loop(locs) 93.763 136.968 225.3947 174.411 250.296 1446.714 10000   b
  vec(locs) 71.641 133.184 205.0957 163.727 233.087 2123.004 10000  a 
9 total files, real found at 1
Unit: microseconds
       expr    min       lq      mean   median       uq     max neval cld
 loop(locs)  2.491   4.0965  10.32509   5.6410  11.5605 269.412 10000  a 
  vec(locs) 76.331 117.9965 190.89297 147.9125 223.2825 806.500 10000   b
9 total files, real found at 2
Unit: microseconds
       expr    min       lq      mean   median       uq     max neval cld
 loop(locs) 11.221  19.2705  35.40271  25.5505  35.7920 406.263 10000  a 
  vec(locs) 66.733 112.0855 173.27055 141.1910 194.4895 872.747 10000   b
9 total files, real found at 3
Unit: microseconds
       expr    min       lq      mean   median       uq     max neval cld
 loop(locs) 23.480  34.2095  58.80281  44.5205  60.7345 462.991 10000  a 
  vec(locs) 74.001 114.2955 175.35795 143.3295 193.9090 996.732 10000   b
9 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean   median      uq      max neval cld
 loop(locs) 30.911  47.609  81.46682  61.7295  83.570  694.771 10000  a 
  vec(locs) 64.860 113.380 177.20811 142.4900 196.491 1409.691 10000   b
9 total files, real found at 5
Unit: microseconds
       expr    min       lq     mean   median       uq     max neval cld
 loop(locs) 43.982  59.5810 101.5602  76.2420 104.7665  688.13 10000  a 
  vec(locs) 74.091 109.0345 172.0624 137.5955 192.3300 1144.95 10000   b
9 total files, real found at 6
Unit: microseconds
       expr    min       lq     mean  median      uq     max neval cld
 loop(locs) 49.681  75.7010 126.4540  98.681 133.175 779.402 10000  a 
  vec(locs) 76.732 112.5155 174.5577 141.996 195.111 814.733 10000   b
9 total files, real found at 7
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 64.221  91.376 158.2050 118.7125 171.3455 1225.353 10000  a 
  vec(locs) 71.821 114.078 186.4487 145.9505 213.5485 1175.594 10000   b
9 total files, real found at 8
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 73.891 104.5550 172.8278 135.9875 186.8610 1223.389 10000  a 
  vec(locs) 65.941 113.2955 178.9752 143.6430 202.3195 1248.674 10000   b
9 total files, real found at 9
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 83.290 119.3900 191.6754 155.9320 208.4945 1083.891 10000   b
  vec(locs) 72.231 114.0065 178.3682 145.5215 203.0935 1114.512 10000  a 
10 total files, real found at 0
Unit: microseconds
       expr     min      lq     mean   median      uq      max neval cld
 loop(locs) 101.100 150.954 251.7012 195.8115 287.066 15461.17 10000   b
  vec(locs)  91.509 145.674 225.9473 184.0140 267.009  1032.23 10000  a 
10 total files, real found at 1
Unit: microseconds
       expr   min       lq      mean   median       uq      max neval cld
 loop(locs)  2.52   4.2200  11.63648   5.9605  13.6800  620.610 10000  a 
  vec(locs) 81.59 138.2515 219.69317 171.5790 265.3955 1546.269 10000   b
10 total files, real found at 2
Unit: microseconds
       expr    min       lq      mean   median       uq     max neval cld
 loop(locs) 11.972  19.1910  35.64845  25.0810  35.8700 369.771 10000  a 
  vec(locs) 73.513 121.4805 186.84957 152.9665 207.7015 920.172 10000   b
10 total files, real found at 3
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 23.110  34.8800  63.4901  45.5350  65.365  696.250 10000  a 
  vec(locs) 81.851 129.7855 205.0357 163.0525 235.984 1279.019 10000   b
10 total files, real found at 4
Unit: microseconds
       expr    min      lq      mean   median       uq      max neval cld
 loop(locs) 29.852  47.265  78.96328  61.0960  80.7945  558.491 10000  a 
  vec(locs) 87.912 125.565 188.14244 157.5415 209.7290 1396.780 10000   b
10 total files, real found at 5
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 44.680  63.1145 110.4293  82.1610 116.550  996.111 10000  a 
  vec(locs) 82.939 130.0520 201.9806 163.0035 235.151 1579.250 10000   b
10 total files, real found at 6
Unit: microseconds
       expr   min       lq     mean   median       uq      max neval cld
 loop(locs) 50.98  78.1715 129.2693 100.7050 135.5565 1368.653 10000  a 
  vec(locs) 85.94 129.9905 199.3372 163.2705 224.0715 2380.635 10000   b
10 total files, real found at 7
Unit: microseconds
       expr    min       lq     mean   median      uq      max neval cld
 loop(locs) 60.871  92.7505 149.9677 119.2960 158.807  858.911 10000  a 
  vec(locs) 87.908 128.9905 195.9447 160.5365 218.536 1362.421 10000   b
10 total files, real found at 8
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 75.550 110.1450 184.7645 141.4195 202.0715 1032.571 10000  a 
  vec(locs) 88.621 132.3455 207.2783 166.0605 239.8825 1130.729 10000   b
10 total files, real found at 9
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 86.560 121.7795 205.4248 158.0550 223.6305 1557.151 10000   a
  vec(locs) 80.522 131.1065 208.2254 166.3755 240.5045 1007.602 10000   a
10 total files, real found at 10
Unit: microseconds
       expr    min       lq     mean   median       uq       max neval cld
 loop(locs) 86.791 136.6975 223.8847 176.0035 245.3510 13029.740 10000   b
  vec(locs) 84.731 129.0220 200.9690 163.1715 230.6565  2580.405 10000  a 
11 total files, real found at 0
Unit: microseconds
       expr     min       lq     mean   median       uq       max neval cld
 loop(locs) 110.629 172.8010 256.1458 214.6470 278.1760 13335.853 10000   b
  vec(locs) 105.321 163.8725 230.9527 198.8825 260.1505  1366.924 10000  a 
11 total files, real found at 1
Unit: microseconds
       expr    min       lq      mean   median      uq      max neval cld
 loop(locs)  2.511   3.7300  11.02105   5.7510  13.171  195.091 10000  a 
  vec(locs) 96.192 143.6125 233.00604 183.8475 283.776 1677.791 10000   b
11 total files, real found at 2
Unit: microseconds
       expr    min       lq      mean  median       uq      max neval cld
 loop(locs) 12.563  19.7310  38.03963  26.231  40.2015  428.851 10000  a 
  vec(locs) 79.431 139.8865 218.38360 176.178 255.8270 1614.834 10000   b
11 total files, real found at 3
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 21.771  35.5405  66.9752  46.0750  71.2610  666.420 10000  a 
  vec(locs) 97.150 145.5665 232.2503 182.4865 278.1855 1194.731 10000   b
11 total files, real found at 4
Unit: microseconds
       expr    min     lq      mean   median       uq     max neval cld
 loop(locs) 29.190  49.08  86.93135  63.8500  90.8105 616.683 10000  a 
  vec(locs) 88.977 143.80 223.22978 179.6055 261.3805 998.723 10000   b
11 total files, real found at 5
Unit: microseconds
       expr    min      lq     mean   median      uq      max neval cld
 loop(locs) 40.801  65.171 113.6301  84.4005 122.455  867.122 10000  a 
  vec(locs) 91.182 146.306 227.0696 184.0720 264.641 1046.274 10000   b
11 total files, real found at 6
Unit: microseconds
       expr    min       lq     mean  median      uq      max neval cld
 loop(locs) 52.861  78.1765 139.7396 101.491 151.512 2169.480 10000  a 
  vec(locs) 93.422 144.3660 230.1626 182.930 272.270 1471.942 10000   b
11 total files, real found at 7
Unit: microseconds
       expr    min       lq     mean   median       uq      max neval cld
 loop(locs) 62.540  92.0405 157.7746 119.2515 170.8665 1503.162 10000  a 
  vec(locs) 97.081 143.5005 226.6952 181.5910 268.6525 1014.763 10000   b
11 total files, real found at 8
Unit: microseconds
       expr   min       lq     mean   median       uq      max neval cld
 loop(locs) 74.82 107.7445 185.4375 139.7900 205.3905 1451.880 10000  a 
  vec(locs) 96.02 145.3605 229.3997 183.4115 272.4065 1164.991 10000   b
11 total files, real found at 9
Unit: microseconds
       expr    min      lq     mean   median       uq      max neval cld
 loop(locs) 86.341 122.759 205.1270 158.9915 227.5955 1195.851 10000  a 
  vec(locs) 94.560 144.671 227.1433 183.3280 267.4695 1110.570 10000   b
11 total files, real found at 10
Unit: microseconds
       expr    min       lq     mean   median       uq       max neval cld
 loop(locs) 93.643 159.1995 242.2441 192.7795 277.7655  1231.191 10000   a
  vec(locs) 85.930 165.2375 244.0546 199.9255 283.0320 13871.865 10000   a
11 total files, real found at 11
Unit: microseconds
       expr     min       lq     mean   median       uq      max neval cld
 loop(locs) 107.982 152.3290 249.7597 195.1235 284.7005 14105.58 10000   b
  vec(locs)  98.900 144.3765 225.3624 182.1285 264.7345  2182.69 10000  a 

@MichaelChirico
Collaborator Author

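The gap shrank, but the for loop still comes out ahead in most positions, especially if the config is typically found at the first index. In these runs the vectorized version only wins or ties when the file is missing or sits at or near the end of the list. We'll keep the for loop.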

@MichaelChirico
Collaborator Author

(file.exists() may be faster or slower depending on architecture and OS, so feel free to run this benchmark on your own computer and compare.)
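
For anyone who wants a quick check without installing microbenchmark, here is a minimal base-R sketch of the same comparison. The helper names (first_existing_loop, first_existing_vec, time_it) and the fake file names are hypothetical and only for illustration; they are not lintr's code. system.time() is much coarser than microbenchmark, so treat the numbers as a rough indication only.

```r
# Hypothetical helpers for illustration; only base R is used.

# Stop at the first location that exists (the approach kept in lintr).
first_existing_loop <- function(locs) {
  for (loc in locs) {
    if (file.exists(loc)) return(loc)
  }
  NULL
}

# Check every location in one vectorized call, then take the first hit.
first_existing_vec <- function(locs) {
  hits <- locs[file.exists(locs)]
  if (length(hits) > 0L) hits[[1L]] else NULL
}

real_file <- tempfile()
file.create(real_file)
# Assumed not to exist in the session's temporary directory.
fake_files <- file.path(tempdir(), paste0("no_such_file_", 1:9))

# Real file first (best case for the loop) and last (worst case for the loop).
locs_first <- c(real_file, fake_files)
locs_last <- c(fake_files, real_file)

# Both approaches should agree on the result.
stopifnot(
  identical(first_existing_loop(locs_first), first_existing_vec(locs_first)),
  identical(first_existing_loop(locs_last), first_existing_vec(locs_last))
)

# Total elapsed seconds for `times` repeated calls of f(locs).
time_it <- function(f, locs, times = 10000L) {
  system.time(for (i in seq_len(times)) f(locs))[["elapsed"]]
}

timings <- c(
  loop_first = time_it(first_existing_loop, locs_first),
  vec_first = time_it(first_existing_vec, locs_first),
  loop_last = time_it(first_existing_loop, locs_last),
  vec_last = time_it(first_existing_vec, locs_last)
)
print(timings)

unlink(real_file)
```

If the pattern above holds on your machine, loop_first should be far smaller than vec_first, while loop_last and vec_last should be close to each other.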

@AshesITR
Collaborator

Thanks for the thorough comparison.
